1.0 GENERAL SUMMARY


This final report represents design documentation and user documentation for
Function Algorithms for the Massively Parallel Processor (MPP) developed by
Goodyear Aerospace Corporation (GAC) under NASA contract NAS5-27610.

The contract specifies development of MPP assembler instructions to perform
the following functions:

Natural Logarithm 
Exponential (e to the x power) 

Square Root 
Sine 
Cosine 
Arctangent 

To fulfill the requirements of the contract, parallel array and scalar
implementations for these functions have been developed on the PDP11/34
Program Development and Management Unit (PDMU) that is resident at the MPP
testbed installation located at the NASA Goddard facility.

1.1 REQUIREMENTS SUMMARY 


Each function was specified to perform on parallel array data located in the 
MPP Array Unit, and serial data located in the MPP Main Control Unit. 
Function Arguments and results were specified as real VAX-standard 32-bit 
floating-point format. 


Arguments to the sine and cosine functions were required to be in radians.
Results of the arctangent function were to range between -pi/2 and +pi/2.

Specifications for error conditions that were required of each function are:


natural logarithm    overflow on arguments outside domain of
                     implementation.

exponential          overflow on arguments outside domain of
                     implementation.

sine                 none.

cosine               none.

arctangent           none.

square root          error on negative input arguments.


Errors for the array functions were to be indicated by a logical one placed in 
an error bit plane. Errors for the MCU functions were to be indicated by 
setting error flag bits. 

Accuracy for each function was specified as:


natural logarithm    results accurate to full range of
                     floating point format.

exponential          results accurate to full range of
                     floating point format.

sine                 6 digit precision to the right of
                     the decimal point.

cosine               6 digit precision to the right of
                     the decimal point.

arctangent           6 digit precision to the right of
                     the decimal point, and one integer
                     digit to the left of the decimal point.

square root          results accurate to full range of
                     floating point format.


The contract deliverable items are:


- a Final Project Report containing:

      algorithm descriptions for each function,
      error conditions and their effects,
      how each function may be accessed via a Higher Order Language (HOL),
      measured execution times,
      MPP implementation descriptions.

- a VAX computer compatible tape containing all source code required
  to generate the object code for each function to be executed on
  the MPP.
1.2 RESULTS SUMMARY


Each function has been developed for parallel array data, and serial Main
Control Unit (MCU) data of the standard VAX 32-bit floating-point format.
The array functions were computed iteratively for the LNA, EXPA, SINA,
COSA, and SINCOS subroutines. Therefore, these functions require an MCU
subroutine portion as well as one or more PECU portions. The array
algorithms are discussed in Chapter 2.

All of the serial MCU functions were performed using Discrete Orthonormal
Legendre Polynomial (DOL) expansion. The MCU algorithms are discussed in
Chapter 3.

Chapter 4 describes the filename conventions, library names and user
information to incorporate these functions into MPP applications.


1.2.1 TIMING SUMMARY 


Although the general theory of each algorithm would apply to other 
formats, the implementation of the algorithms has been optimized for the 
VAX format. In addition, optimization of PECU code was performed to 
execute each array function in the fewest possible array cycles. The 
execution times for each function are summarized in Table 1.0.

Note the addition of SINCOS which computes both the Sine and Cosine 
functions in the array. This function was made available with minimal 
effort using the required Sine and Cosine functions, and provides both 
results at a rate near the execution times of the individual subroutines 
for Sine and Cosine. 
Table 1.0 MPP Scientific Function Execution Times

ARRAY FUNCTION     EXECUTION TIME (SEC x 10**-6)

LNA                557.9
EXPA               416.3
SQRTA               74.0
SINA               333.7
COSA               333.7
SINCOS             347.0
ATANA              391.7


MCU FUNCTION       EXECUTION TIME (SEC x 10**-6)

LNM                 62.7
EXPM                65.5
SQRTM               55.7
SINM                42.7
COSM                47.8
ATANM               54.5
1.2.2 SUMMARY OF ERROR CONDITIONS


Errors in array functions are indicated as bits 'set' within an error
bitplane. The error bitplane is designated by a function argument.
Errors in the MCU subroutines are indicated as bits 'set' within a
specific MCU register.

The error conditions are described in detail in the algorithm description
of each subroutine (See Table 3.0) and are summarized as follows:


Natural Logarithm - Errors occur for negative input values.

Exponential       - Errors occur for input values with exponents
                    greater than 2**7 and:

                    Positive Input - Overflow indicated
                    (output set to maximum VAX number).

                    Negative Input - Underflow indicated
                    (output set to VAX '0').

                    (For array inputs less than 2**-31, the output
                    is clipped to VAX '1' and a status bit is set).

Square Root       - Errors occur for negative input values.

Sine, Cosine      - Errors occur for serial inputs greater than 2**24.
                    (See Section 3.2.2).

Arctangent        - None
1.2.3 ACCURACY SUMMARY


The array algorithms described in this report use iterative techniques for
computing each function and are accurate over the full range of VAX
floating-point data.

Detailed theoretical error analysis for the MCU serial algorithms is
included in Appendix A.
2.0 ARRAY ALGORITHMS

Section 2.0 describes the array scientific subroutines. See Section 3.0 
for the description of the MCU subroutines. 

2.1 GENERAL DESCRIPTION OF ALGORITHMS 


2.1.1 NATURAL LOGARITHM SUBROUTINE 


The MPP array field function, ALOGA[X,Y,O,T], uses the input array
field designated by the dummy variable X to create the output
array field designated by the dummy variable, Y. At the row i,
column j (i=0,...,127; j=0,...,127) location of the X field,
the function develops the natural log corresponding to the value
of the element of X, namely, x(i,j), and places the result,
y(i,j), in the same row and column location of the Y field. In a
FORTRAN sense, the field function creates

y(i,j)=ALOG(x(i,j)) where

i=0,...,127 and
j=0,...,127.

On exit from the routine, the 1 bit slice field, O,
provides error status information. The bit slice field, O,
is set wherever the input, X, was negative.


Where an X element is zero, Y will be loaded with
the most negative VAX number possible, namely, Ymin where

Ymin=-(1-(2**(-24)))*(2**(+127)).


Where O is clear the output is in range; all non-zero,
non-negative, X element values produce in-range Y element values.

The 56 bit temporary field, T=(t0, t1, t2,..., t55), specifies
the array memory to be used for scratch purposes during function
execution.


FUNCTION MASKING: Unmasked 


FIELD OVERLAP: The X, Y, O, and T fields are not permitted to be overlapped.

INPUT VARIABLE FIELD X: 

o Number of bit slices: 32 
o X element number type: REAL 

Each X element is a VAX 11/780 single precision floating point number.
Its characteristics follow:
normalized
signed magnitude
exponent biased up by 128
base is 2

bit layout (from left to right):

1 sign bit,
8 biased exponent bits, and
24 mantissa bits (including the suppressed
most significant mantissa bit).
o X units: Dimensionless
o Bit slice notation for X:

- Arbitrary bit slice designator:

x0 is the bit slice that holds the leftmost bits of the elements
of X.

x31 is the bit slice that holds the rightmost bits of the elements
of X.

X=(x0, x1, x2, x3,..., x31), i.e., X comprises the concatenation of
the 32 individual bit slices.

- Notation for sign bit slice: SX=(sx0)=(x0)

- Biased exponent bit slice designator:

EX=(ex0, ex1,..., ex7)=(x1, x2,..., x8)

The exponents of the elements of X are given by EX-128; the base
for all elements is 2.

When X=0, EX=0.

- Mantissa bit slice designator:

MX=(mx0, mx1,..., mx7)=(u, x9, x10,..., x31) where
u is implicit. At one element of X,

u=1 when at the element location, at least one bit of X is
non-zero and

u=0 when at the element location, all bits of X are zero.

When the most significant bit (MSB) slice of the mantissa field,
namely, mx0, is stored into MPP array memory (i.e., when the
implicit MSB slice, u, is stored), the x8 bit slice is used for
storage. Prior to using the ex7 bit slice to store the implicit
u slice, the contents of the ex7 bit slice are stored into the
t0 bit slice.
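
The field layout above can be illustrated with a short Python sketch that turns the three fields of one element into a numeric value. The function name and the convention of passing the fields as separate integers are assumptions made for illustration; the sketch follows the logical left-to-right layout given in the text and does not model how a VAX stores the word in memory.

```python
def decode_vax_f(sign, biased_exp, stored_mantissa):
    """Value of one element from the fields described in the text.

    sign            -- 0 or 1 (bit slice x0)
    biased_exp      -- 0..255 (bit slices x1..x8), exponent biased up by 128
    stored_mantissa -- 0..2**23-1 (bit slices x9..x31); the MSB u is implicit
    """
    if biased_exp == 0:
        return 0.0  # the text states EX=0 when X=0
    # Restore the implicit MSB (u=1) to obtain a 24-bit fraction in [0.5, 1).
    fraction = ((1 << 23) | stored_mantissa) / float(1 << 24)
    value = fraction * 2.0 ** (biased_exp - 128)
    return -value if sign else value


if __name__ == "__main__":
    print(decode_vax_f(0, 129, 0))        # fraction 0.5, exponent +1
    print(decode_vax_f(1, 130, 1 << 22))  # fraction 0.75, exponent +2, negative
```

With all stored mantissa bits set and a biased exponent of 255, the sketch yields (1-(2**(-24)))*(2**(+127)), the largest magnitude quoted elsewhere in this section.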


OUTPUT VARIABLE FIELD Y: 

o Number of bit slices: 32 
o Y element number type: REAL 

Each Y element is a VAX 11/780 single precision floating point number.
Its characteristics follow:
normalized
signed magnitude
exponent biased up by 128
base is 2

bit layout (from left to right):

1 sign bit,
8 biased exponent bits, and
24 mantissa bits (including the suppressed
most significant mantissa bit).
o Y units: Dimensionless
o Bit slice notation for Y:

- Arbitrary bit slice designator:

y0 is the bit slice that holds the leftmost bits of the elements
of Y.

y31 is the bit slice that holds the rightmost bits of the elements
of Y.

Y=(y0, y1, y2, y3,..., y31), i.e., Y comprises the concatenation of
the 32 individual bit slices.

- Notation for sign bit slice: SY=(sy0)=(y0)

- Biased exponent bit slice designator:

EY=(ey0, ey1,..., ey7)=(y1, y2,..., y8).

The exponents of the elements of Y are given by EY-128; the base
for all elements is 2.

When Y=0, EY=0.

- Mantissa bit slice designator:

MY=(my0, my1,..., my7)=(u, y9, y10,..., y31) where
u is implicit. At one element of Y,

u=1 when at the element location, at least one bit of Y is
non-zero and

u=0 when at the element location, all bits of Y are zero.

When the most significant bit (MSB) slice of the mantissa field,
namely, my0, is stored into MPP array memory (i.e., when the
implicit MSB slice, u, is stored), the y8 bit slice is used for
storage. Prior to using the ey7 bit slice to store the implicit
u slice, the contents of the ey7 bit slice are stored into the
t0 bit slice.

OUT-OF-RANGE Y ELEMENTS: 
The range of y for the ALOG function is minus infinity to plus infinity.
(As will be seen, y values in only a very small fraction of the y range
are generated.) The domain of x corresponding to the y range extends
from x=0 through x=+infinity.

Negative X element values imply a complex Y element output. O is loaded
with (1,1) in such cases. Where an X element value is zero, a Y element
value of minus infinity is implied; in such case, the O field is loaded
with (0,1) and the Y element is loaded with

Ymin=-(1-(2**(-24)))*(2**(+127)).

The next X element larger than zero that can be expressed
using the specified number form is

Xsmall=(1+(2**(-23)))*(2**(-129));

it causes the smallest possible signed Y element value, Ysmall, that
is derived from an X element value. It is given by

Ysmall=LN(Xsmall)=-89.4159861839.

The very largest X element value, Xlarge, causes the largest possible
signed Y element. Specifically,

Ylarge=LN(Xlarge)=+88.0296918823 where
Xlarge=(1-(2**(-24)))*(2**(+127)).

For all X element values in the domain extending from Xsmall through
Xlarge, O will be loaded with 0.
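
The two limits quoted above can be checked numerically. The following Python sketch evaluates LN at Xsmall and Xlarge exactly as they are written in the text, using double precision. The variable names are illustrative only; the results should agree with the quoted figures to roughly seven decimal places, with any difference in the trailing digits attributable to the precision of the original computation.

```python
import math

# Xsmall and Xlarge as given in the text.
x_small = (1 + 2.0 ** -23) * 2.0 ** -129
x_large = (1 - 2.0 ** -24) * 2.0 ** 127

y_small = math.log(x_small)   # Ysmall = LN(Xsmall)
y_large = math.log(x_large)   # Ylarge = LN(Xlarge)

print(f"Ysmall = {y_small:.10f}")
print(f"Ylarge = {y_large:+.10f}")
```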


ALGORITHM DEVELOPMENT:

For each element of X (denoted x), y=ALOG(x) must be computed for all
in-range x values. All positive x values are considered in-range;
however, the case of x=0 must be treated specially.

The issue of computing y will be addressed first. An in-range x will be
assumed. Then the issue of determining whether or not x is in-range and
the special case of x=0 will be addressed.

The starting expression is

1) y=LN(x) where LN(x) is the natural log of x.

The inverse form of 1) is

2) x=e**y where e=2.718281828.

Now e can be expressed as

3) e=2**a where a=1/LN(2)=1.4426950407.

Using 3) in 2),

4) x=2**(a*y).

Now, taking the logarithm of x, base 2 (i.e., LOG2(x)), of 4),

5) y=(LN(2))*LOG2(x).

The independent input variable x is a floating point number and so 
is expressed as 

6) x=S*(f*(2**N)) where

f is a fraction having a value less than 1 but greater than
or equal to 0.5 that has the number form (0.0.24),

N is an integer having a value less than 8 but greater than
or equal to -128 that has the number form (1.7.0), and

S is +1 if x is positive and is -1 if x is negative.

But x is positive only (S=+1), and so

7) x=(f*(2**N)).

Using 7) in 5),

8) y=(LN(2))*LOG2(f*(2**N)) which reduces to

9) y=(LN(2))*(N-1+LOG2(2*f)).

Let

10) g=2*f where

gMIN=1 and

gMAX=2*(1-(2**(-24))).

Using 10) in 9),

11) y=(LN(2))*(N-1+LOG2(g)) or,

alternatively, as 

12) y=((LN(2))*N)+zz where

zz=((LN(2))*LOG2(g))-LN(2).
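
As a quick check of the decomposition in expressions 7) through 11), the following Python sketch splits a positive x into f and N and reassembles the natural log. It relies on the standard library function math.frexp, which happens to return a fraction in the same [0.5, 1) range used in the text. The function name is an assumption for illustration, and LOG2(g) is taken from the library here rather than from the iterative method developed next.

```python
import math

def ln_by_decomposition(x):
    """Evaluate LN(x) for x > 0 via y = LN(2)*(N-1+LOG2(g))."""
    # math.frexp returns (f, N) with x = f * 2**N and 0.5 <= f < 1,
    # which matches the form of expression 7).
    f, n = math.frexp(x)
    g = 2.0 * f                                     # expression 10): 1 <= g < 2
    return math.log(2.0) * (n - 1 + math.log2(g))   # expression 11)


if __name__ == "__main__":
    for x in (0.1, 1.0, 3.7, 1.0e10):
        print(x, ln_by_decomposition(x), math.log(x))
```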

The primary task needed to be performed to compute y is that of
generating LOG2(g). Let

13) z=LOG2(g) where

zMIN=0 and

zMAX=1+LOG2(1-(2**(-24))) (if no guard bits are used).

But g can be expressed as the product

14) g=a0*a1*a2*a3*...*aM*...*a24*a25*a26*a27*... where,

in BINARY,

a0=1, 10.
a1=1, 1.1
a2=1, 1.01
a3=1, 1.001
  .
  .
aM=1, 1+(2**(-M))     (for all M)
  .
  .
a24=1, 1.000000000000000000000001
a25=1, 1.0000000000000000000000001
a26=1, 1.00000000000000000000000001
a27=1, 1.000000000000000000000000001


Each "a" value can assume the value of 1 or the non-one value (to the
right of the comma in the list above). Either "a" value can be written
as 2 to some power. In particular,

15) aM=2**uM .


As a result, 14) can be written as

16) g=(2**u0)*(2**u1)*(2**u2)*...*(2**uM)*...*(2**u24)*(2**u25)*(2**u26)*...
or as

17) g=2**(u0+u1+u2+u3+...+uM+...+u24+u25+u26+u27+...)
Substituting 17) into 13) shows that

18) z=u0+u1+u2+u3+...+uM+...+u24+u25+u26+u27+...

When aM=1, uM=0. The list of non-zero "u" values (corresponding to "a"
values not equal to 1) is provided in Table 2.1.1.

An iterative approach is used to find z. Assume that iteration (M-1)
has just taken place; an (M-1) iteration "g" value, g[M-1], as well as
an (M-1) iteration "z" value, z[M-1], have just been developed.

To accomplish the next iteration M values, g[M] and z[M], the following 
expressions are used: 

19) g[M]=g[M-1]*aM where

g[M]=g[M-1]+SHFT(g[M-1],-M) and TEST=1    when aM*g[M-1]<g or
                                          when aM*g[M-1]=g ; else,

g[M]=g[M-1] and TEST=0                    when aM*g[M-1]>g .

Also,

20) z[M]=z[M-1]+uM where

uM=LN(aM)/LN(2)    when TEST=1 ; else,

uM=0               when TEST=0

The iterations begin at M=1. For the 1st iteration,

g[0]=1 and z[0]=0 .

Using the expressions 19) and 20), z can be determined to any level
of precision. Using the z determined in 13) and then in 11),
y=ALOG(x) is determined.

A slightly more efficient way to compute y results by multiplying the
uM values with LN(2) prior to performing the iteration operations.
Substituting z from 13) into the auxiliary expression of 12) yields

21) zz=LN(2)*z-LN(2) .

Using 18) in 21) yields,

22) zz=(LN(2))*(u0+u1+u2+u3+...+uM+...+u24+u25+u26+u27+...)-LN(2)

Let

23) vM=(LN(2))*(uM) for all M. Using 23) in 22),
24) zz=(v0+v1+v2+v3+...+vM+...+v24+v25+v26+v27+...)-LN(2)

(See Table 2.1.1 for vM values.)

As for z, an iterative approach can be used to find zz. Assume that
iteration (M-1) has just taken place; an (M-1) iteration "g" value,
g[M-1], as well as an (M-1) iteration "zz" value, zz[M-1], have just
been developed. To accomplish the next iteration M values, g[M] and
zz[M], the following expressions are used:

25) g[M]=g[M-1]*aM where

g[M]=g[M-1]+SHFT(g[M-1],-M) and TEST=1    when aM*g[M-1]<g or
                                          when aM*g[M-1]=g ; else,

g[M]=g[M-1] and TEST=0                    when aM*g[M-1]>g

Also,

26) zz[M]=zz[M-1]+vM where

vM=LN(aM)    when TEST=1 ; else,

vM=0         when TEST=0

The iterations begin at M=1. For the 1st iteration,

g[0]=1 and

zz[0]=-LN(2) .

Using the zz[M] of 26) for zz in 12) permits y to be determined using an
economical multiply of a common value, LN(2), times the narrow 8 bit
field, N.
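
The iteration of expressions 25) and 26), combined with expression 12), can be sketched in Python as follows. This is only an illustration of the shift-and-add idea in ordinary double precision arithmetic, not the PECU implementation: the function name, the default of 27 iterations, and the use of library calls in place of the tabulated vM constants are all assumptions.

```python
import math

def aloga_sketch(x, iterations=27):
    """Shift-and-add natural log following expressions 12), 25) and 26)."""
    if x <= 0.0:
        raise ValueError("sketch covers positive x only")
    f, n = math.frexp(x)          # x = f * 2**n with 0.5 <= f < 1
    g_target = 2.0 * f            # the g of expression 10), 1 <= g < 2
    g = 1.0                       # g[0]
    zz = -math.log(2.0)           # zz[0]
    for m in range(1, iterations + 1):
        trial = g + g * 2.0 ** -m         # g[M-1] + SHFT(g[M-1], -M)
        if trial <= g_target:             # TEST=1
            g = trial
            zz += math.log1p(2.0 ** -m)   # vM = LN(aM), tabulated in practice
    return math.log(2.0) * n + zz         # expression 12)


if __name__ == "__main__":
    for x in (0.001, 0.75, 1.0, 2.5, 12345.678):
        print(x, aloga_sketch(x), math.log(x))
```

Each accepted step multiplies the running product by (1+2**(-M)) using only a shift and an add, so the loop needs no general multiplier; the single multiply by N is deferred to the end, as the text notes.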
Table 2.1.1 - Values Of uM And vM Corresponding To Different aM Values


 M    aM=1+(.5**M)     uM=LOG2(aM)           vM=LN(2)*uM

 0    2                .99999999999          .693147180644
 1    1.5              .584962500792         .405465108211
 2    1.25             .321928094818         .223143551296
 3    1.125            .169925001262         .117783035547
 4    1.0625           .087462841164         .060624621765
 5    1.03125          .044394119491         .030771658763
 6    1.015625         .022367812829         .0155041864
 7    1.0078125        .011227255583         .00778214055402
 8    1.00390625       .00562454912855       .00389864037089
 9    1.001953125      .00281501548714       .0019512200484
10    1.0009765625     .00140819422001       .00097608585341
11    1.0004882812     .00070426868758       .00048816185522
12    1.0002441406     .00035217719666       .00024411063096
13    1.0001220703     .00017609939459       .00012206279888
14    1.0000610351     .0000880523548428     .000061033241509
15    1.0000305175     .000044026841807      .0000305170812715
16    1.0000152587     .0000220134209034     .0000152585406357
17    1.0000076293     .0000110068765481     .0000076293854471
18    1.0000038146     .0000055034382739     .0000038146927235
19    1.0000019073     .0000027515530405     .0000019072312325
20    1.0000009536     .0000013756104239     9.53500487011E-7
21    1.0000004768     6.87639115553E-7      4.76635114251E-7
22    1.0000002384     3.44151750584E-7      2.38547815634E-7
23    1.0000001192     1.72075875292E-7      1.19273907817E-7
24    1.0000000596     8.6037937645E-8       5.9636953908E-8
25    1.0000000298     4.2852872417E-8       2.9703347699E-8
26    1.0000000149     2.1592532613E-8       1.4966803104E-8
27    1.0000000074     1.0630169902E-8       7.3682722976E-9
28    1.0000000037     4.98289214169E-9      3.4538776395E-9
29    1.0000000018     2.32534966612E-9      1.6118095651E-9
30    1.0000000009     1.32877123778E-9      9.210340372E-10
31    1.0000000004     6.6438561889E-10      4.605170186E-10
32    1.0000000002     0                     0
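
For readers who wish to regenerate the constants, the column definitions of Table 2.1.1 can be evaluated directly. The Python sketch below is an illustration under the assumption that double precision library logarithms are acceptable; the function name is hypothetical. Values computed this way may differ from the tabulated ones in their trailing digits, particularly for the largest M.

```python
import math

def table_row(m):
    """Return (aM, uM, vM) for one row of Table 2.1.1."""
    step = 0.5 ** m
    a = 1.0 + step                 # aM = 1+(.5**M)
    v = math.log1p(step)           # vM = LN(aM), accurate for small steps
    u = v / math.log(2.0)          # uM = LOG2(aM)
    return a, u, v


if __name__ == "__main__":
    for m in (0, 1, 2, 3, 8, 16):
        a, u, v = table_row(m)
        print(f"{m:2d}  {a:.10f}  {u:.12e}  {v:.12e}")
```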




MPP SCIENTIFIC SUBROUTINES 


GOODYEAR AEROSPACE 
CORPORATION 
GER-17221


2.1.2 EXPONENTIAL ARRAY SUBROUTINE 


DESCRIPTION: The MPP array field function, EXPA(X,Y,O,T), uses the input array
field designated by the dummy variable X to create the output
array field designated by the dummy variable, Y. At the row i,
column j (i=0,...,127; j=0,...,127) location of the X field,
the function exponentiates the value of the element of X,
namely, x(i,j), and places the result, y(i,j), in the same row
and column location of the Y field. In a FORTRAN sense, the
field function creates

y(i,j)=EXP(x(i,j)) where

i=0,...,127 and
j=0,...,127.

On exit from the routine, the field, O=(o0, o1, o2), provides
out-of-range status information as follows:

o0 - set where the output Y was declared equal to VAX '1'
     because the input X was less than 2**-31, or nearly zero.

o1 - set where the output was too small to be represented in VAX
     format because the input X was less than -2**7;
     the output Y will be cleared to all 0's for this case.

o2 - set where overflow has occurred because the input X was
     greater than 2**7; the maximum VAX number will be
     inserted in Y.


The 128 bit temporary field, T=(t0, t1, t2,..., t127), specifies
the array memory to be used for scratch purposes during function
execution.

FUNCTION MASKING: Unmasked 

FIELD OVERLAP: The X, Y, O, and T fields are not permitted to be overlapped.

INPUT VARIABLE FIELD X: 

o Number of bit slices: 32 
o X element number type: REAL 

Each X element is a VAX 11/780 single precision floating point number.
Its characteristics follow:
normalized
signed magnitude
exponent biased up by 128
base is 2

bit layout (from left to right):

1 sign bit,
8 biased exponent bits, and
24 mantissa bits (including the suppressed
most significant mantissa bit).
o X units: Dimensionless
o Bit slice notation for X:

- Arbitrary bit slice designator:

x0 is the bit slice that holds the leftmost bits of the elements
of X.

x31 is the bit slice that holds the rightmost bits of the elements
of X.

X=(x0, x1, x2, x3,..., x31), i.e., X comprises the concatenation of
the 32 individual bit slices.

- Notation for sign bit slice: SX=(sx0)=(x0)

- Biased exponent bit slice designator:

EX=(ex0, ex1,..., ex7)=(x1, x2,..., x8)

The exponents of the elements of X are given by EX-128; the base
for all elements is 2.

When X=0, EX=0.

- Mantissa bit slice designator:

MX=(mx0, mx1,..., mx7)=(u, x9, x10,..., x31) where
u is implicit. At one element of X,

u=1 when at the element location, at least one bit of X is
non-zero and

u=0 when at the element location, all bits of X are zero.

When the most significant bit (MSB) slice of the mantissa field,
namely, mx0, is stored into MPP array memory (i.e., when the
implicit MSB slice, u, is stored), the x8 bit slice is used for
storage. Prior to using the ex7 bit slice to store the implicit
u slice, the contents of the ex7 bit slice are stored into the
t0 bit slice.


OUTPUT VARIABLE FIELD Y: 

o Number of bit slices: 32 
o Y element number type: REAL 

Each Y element is a VAX 11/780 single precision floating point number.
Its characteristics follow:
normalized
signed magnitude
exponent biased up by 128
base is 2

bit layout (from left to right):

1 sign bit,
8 biased exponent bits, and
24 mantissa bits (including the suppressed
most significant mantissa bit).
o Y units: Dimensionless
o Bit slice notation for Y:

- Arbitrary bit slice designator:

y0 is the bit slice that holds the leftmost bits of the elements
of Y.

y31 is the bit slice that holds the rightmost bits of the elements
of Y.

Y=(y0, y1, y2, y3,..., y31), i.e., Y comprises the concatenation of
the 32 individual bit slices.

- Notation for sign bit slice: SY=(sy0)=(y0)

- Biased exponent bit slice designator:

EY=(ey0, ey1,..., ey7)=(y1, y2,..., y8).

The exponents of the elements of Y are given by EY-128; the base
for all elements is 2.

When Y=0, EY=0.

- Mantissa bit slice designator:

MY=(my0, my1,..., my7)=(u, y9, y10,..., y31) where
u is implicit. At one element of Y,

u=1 when at the element location, at least one bit of Y is
non-zero and

u=0 when at the element location, all bits of Y are zero.

When the most significant bit (MSB) slice of the mantissa field,
namely, my0, is stored into MPP array memory (i.e., when the
implicit MSB slice, u, is stored), the y8 bit slice is used for
storage. Prior to using the ey7 bit slice to store the implicit
u slice, the contents of the ey7 bit slice are stored into the
t0 bit slice.



OUT-OF-RANGE Y ELEMENTS: 

Assuming real values for the X elements, the complete range of y for the
EXPA function is 0 through plus infinity. Only a countable number of
X and Y elements can be represented by the specified floating point numbers.
In particular, the smallest magnitude non-zero Y element value that can be
represented using the VAX floating point form will be designated Ysmall;
the largest Y element value that can be represented using the VAX floating
point form will be designated Ylarge. Ysmall and Ylarge are given by

Ysmall=(1+(2**(-23)))*(2**(-129)) and

Ylarge=(1-(2**(-24)))*(2**(+127)), respectively.

Y element values that lie between 0 and Ysmall must be described as
0 or Ysmall. It is reasonable to assert that a Y element with a value
that lies in the upper half of the 0 to Ysmall interval, i.e., between
Ymin=(1+(2**(-24)))*(2**(-129)) and Ysmall will be assigned the value
Ysmall. A Y element value smaller than Ymin (in the lower half of the
0 to Ysmall interval) will be set to 0 and the o1 bit slice will be
set (i.e., O=(0,1)). The X element value corresponding to Ymin is

Xmin=LN(Ymin)=-89.4159862435.

Y element values that lie above Ylarge but are smaller than the value
Ymax=(1-(2**(-25)))*(2**(+127)) will be assigned the value Ylarge.

Where Y element values lie above Ymax, the overflow out-of-range bit
will be set in the o0 bit slice (i.e., O=(1,0)). The X element value
corresponding to Ymax is

Xmax=LN(Ymax)=+88.029691912

For Y element values that lie from Ymin through Ymax, O=(0,0).

ALGORITHM DEVELOPMENT: 

For each element of X (denoted x), y=EXP(x) must be computed for all
in-range x values. In-range x values are larger than or equal to
Xmin and are smaller than or equal to Xmax.

The issue of computing y will be addressed first. An in-range x will be 
assumed. Then the issue of determining whether or not x is in-range will 
be addressed. 

The starting expression is 

1) y=e**x where e=2.718281828

Now e can be expressed as

2) e=2**a where a=1/LN(2)=1.4426950407

Using 2) in 1),

3) y=2**(a*x) .

The independent input variable x is a floating point number and so
is expressed as

4) x=S*(f*(2**N)) where

f is a fraction having a value less than 1 but greater than
or equal to 0.5 that has the number form (0.0.24),

N is an integer having a value less than 8 but greater than
or equal to -128 that has the number form (1.7.0), and

S is +1 if x is positive and is -1 if x is negative.

Using 4) in 3),

5) y=2**(S*(a*f)*(2**N))

Let

6) b=a*f where

bMAX=1.4426950407

bMIN=.72134752035

Using 6) in 5),

7) y=2**(S*b*(2**N))

or since (2**N) simply causes a shift of the radix point of b,

8) y=2**(S*SHFT(b,N)) where

SHFT(b,N) implies a shift of the radix point of b equal to the
magnitude of N, to the right if the sign of N is + and to the
left if the sign of N is -.


As an aside, at the low end of the range of X (Xmin=-89.4159862435),

9) b=LN(Ymin)/(128*LN(2))=+1.0078124994,

N=7, and
S=-1 .

At the top end of the range of X (Xmax=+88.029691912),

10) b=LN(Ymax)/(128*LN(2))=+.992187499663,

N=7, and
S=+1 .


Once SHFT(b,N) has been developed, it can be written as the sum of an 
integer part, I, (+ or -) and an always + fractional part, g, i.e., as 

11) I+g=SHFT(b,N) 


Then, using 11) in 8),

12) y=2**(I+g)=(2**I)*(2**g)=(2**(I+1))*((2**g)/2).

The integer variable (I+1) is the unbiased exponent of the output
y value. The mantissa of the output is given by ((2**g)/2). Thus,
the primary task to be performed is that of generating 2**g. Let

13) z=2**g where

gMIN=0 and

gMAX=1-(2**(-24)) (if no guard bits are used).
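
The split of expressions 11) and 12) into an integer part I and a fractional part g can be illustrated with a few lines of Python. The sketch below is an assumption-laden illustration rather than the MPP code: it forms a*x directly in floating point instead of shifting the radix point of b, and it takes 2**g from the library instead of the iteration developed next. The function name is hypothetical.

```python
import math

def exp_by_decomposition(x):
    """Evaluate EXP(x) as (2**(I+1)) * ((2**g)/2), per expression 12)."""
    a = 1.0 / math.log(2.0)        # expression 2)
    t = a * x                      # numerically equal to S*SHFT(b,N) of 8)
    i = math.floor(t)              # integer part I (may be negative)
    g = t - i                      # fractional part, 0 <= g < 1
    mantissa = (2.0 ** g) / 2.0    # output mantissa, lies in [0.5, 1)
    return math.ldexp(mantissa, i + 1)   # mantissa * 2**(I+1)


if __name__ == "__main__":
    for x in (-20.5, -1.0, 0.0, 0.3, 10.0):
        print(x, exp_by_decomposition(x), math.exp(x))
```

Because the mantissa (2**g)/2 always falls in [0.5, 1), the result is already normalized in the sense of the floating point form described earlier.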

But z can be expressed as the product 

14) z=a0*a1*a2*a3*...*aM*...*a24*a25*a26*a27*... where,

in BINARY,

a0=1, 10.
a1=1, 1.1
a2=1, 1.01
a3=1, 1.001
  .
  .
aM=1, 1+(2**(-M))     (for all M)
  .
  .
a24=1, 1.000000000000000000000001
a25=1, 1.0000000000000000000000001
a26=1, 1.00000000000000000000000001
a27=1, 1.000000000000000000000000001


Each "a" value can assume the value of 1 or the non-one value (to the
right of the comma in the list above). Either "a" value can be written
as 2 to some power. In particular,

15) aM=2**uM .

As a result, 14) can be written as

16) z=(2**u0)*(2**u1)*(2**u2)*...*(2**uM)*...*(2**u24)*(2**u25)*(2**u26)*...
or as

17) z=2**(u0+u1+u2+u3+...+uM+...+u24+u25+u26+u27+...)

Comparing 17) to 13) shows that

18) g=u0+u1+u2+u3+...+uM+...+u24+u25+u26+u27+...

When aM=1, uM=0. The list of non-zero "u" values (corresponding to "a"
values not equal to 1) is provided in Table 2.1.2.

An iterative approach is used to find z. Assume that iteration (M-1)
has just taken place; an (M-1) iteration "g" value, g[M-1], as well as
an (M-1) iteration "z" value, z[M-1], have just been developed.

To accomplish the next iteration M values, g[M] and z[M], the following 
expressions are used: 

19) g[M]=g[M-1]-uM where

uM=0                 when uM>g[M-1] ; else,

uM=LN(aM)/LN(2)      when uM<g[M-1] or
                     when uM=g[M-1]

(the uM tested in each condition is the tabulated value LN(aM)/LN(2)).


20) z[M]=z[M-1]*aM or

    =z[M-1]                      when uM=0 ; else,

    =z[M-1]+SHFT(z[M-1],-M)      when uM is non-zero

The iterations begin at M=1. For the 1st iteration,

g[0]=g (the g of expression 13)) and
z[0]=1 .

Using the expressions 19) and 20), z can be determined to any level
of precision. By dividing z by 2 (by shifting the radix point of z
left by 1 bit position), the mantissa of y=EXP(x) is determined.
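
The iteration of expressions 19) and 20) can be sketched in Python as shown below. As with the logarithm, this is an illustration in double precision under stated assumptions, not the PECU routine: the function name and the default of 27 iterations are hypothetical, a*x is formed by an ordinary multiply, and the uM constants are computed by a library call where the MPP code would use tabulated values.

```python
import math

def expa_sketch(x, iterations=27):
    """Shift-and-add exponential following expressions 12), 19) and 20)."""
    t = x / math.log(2.0)          # a*x, so that y = 2**t
    i = math.floor(t)              # integer part I
    g = t - i                      # g[0], with 0 <= g < 1
    z = 1.0                        # z[0]
    for m in range(1, iterations + 1):
        u = math.log2(1.0 + 2.0 ** -m)   # tabulated uM = LOG2(aM)
        if u <= g:                       # uM can be removed from g
            g -= u                       # expression 19)
            z += z * 2.0 ** -m           # z[M-1] + SHFT(z[M-1], -M)
    return math.ldexp(z / 2.0, i + 1)    # mantissa z/2, exponent I+1


if __name__ == "__main__":
    for x in (-10.0, -0.5, 0.0, 1.0, 5.25):
        print(x, expa_sketch(x), math.exp(x))
```

The structure mirrors the logarithm sketch with the roles of the two running quantities exchanged: here the tabulated constants are subtracted from g while z is built up by shift-and-add.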
Table 2.1.2 - Values Of uM Corresponding To Different aM Values


 M    aM=1+(.5**M)     uM=LOG2(aM)

 0    2                1.000000000000
 1    1.5              .584962500792
 2    1.25             .321928094818
 3    1.125            .169925001262
 4    1.0625           .087462841164
 5    1.03125          .044394119491
 6    1.015625         .022367812829
 7    1.0078125        .011227255583
 8    1.00390625       .00562454912855
 9    1.001953125      .00281501548714
10    1.0009765625     .00140819422001
11    1.0004882812     .00070426868758
12    1.0002441406     .00035217719666
13    1.0001220703     .00017609939459
14    1.0000610351     .0000880523548428
15    1.0000305175     .000044026841807
16    1.0000152587     .0000220134209034
17    1.0000076293     .0000110068765481
18    1.0000038146     .0000055034382739
19    1.0000019073     .0000027515530405
20    1.0000009536     .0000013756104239
21    1.0000004768     6.87639115553E-7
22    1.0000002384     3.44151750584E-7
23    1.0000001192     1.72075875292E-7
24    1.0000000596     8.6037937645E-8
25    1.0000000298     4.2852872417E-8
26    1.0000000149     2.1592532613E-8
27    1.0000000074     1.0630169902E-8
28    1.0000000037     4.98289214169E-9
29    1.0000000018     2.32534966612E-9
30    1.0000000009     1.32877123778E-9
31    1.0000000004     6.6438561689E-10
32    1.0000000002     0
2.1.3 SQUARE ROOT ARRAY SUBROUTINE


DESCRIPTION:

SQRTV is a PECU routine that computes the square root of an array (X) and
places the result in another array (Q). The entry point is SQRTVI. Arrays X
and Q each contain 32-bit floating-point numbers in VAX-F format. The routine
requires a 22-plane array for temporary storage (T). No error occurs as long
as X is non-negative. The sign of X is stored in an error bit plane (E).

If an element of X is positive then its value is:

(0.5 + x2/4 + x3/8 + ... + x24/(2**24)) *
2**(e0 + 2*e1 + 4*e2 + ... + 128*e7 - 128)

where x2, x3, ..., x24 are the fraction bits and e0, e1, ..., e7 are the
characteristic bits of the element. Similarly, its square root in Q has the
value:

(0.5 + q2/4 + q3/8 + ... + q24/(2**24)) *
2**(y0 + 2*y1 + 4*y2 + ... + 128*y7 - 128)

where q2, q3, ..., q24 are the fraction bits and y0, y1, ..., y7 are the
characteristic bits.

If e0 = 1, then the X-fraction should be shifted right once and unity added to 
the X-characteristic. This puts all X-fractions in the range of 0.25 to 1 and 
makes all X-exponents even. Then the Q-exponent is half the X-exponent and 
the Q-fraction is the square root of the X-fraction. When we take into 
account the characteristic bias of 128 we obtain the following binary addition 
for the Y-characteristic: 

      0   e7  e6  e5  e4  e3  e2  e1 
   +  0   1   0   0   0   0   0   e0 
      ------------------------------ 
      y7  y6  y5  y4  y3  y2  y1  y0 

This arithmetic is performed near the end of the routine. 
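
The characteristic addition above can be checked with a short scalar sketch. 
It is written in Python purely for illustration (the routine itself is PECU 
code), and the function name sqrt_characteristic is hypothetical. 

```python
def sqrt_characteristic(e: int) -> int:
    """Result characteristic for a nonzero 8-bit input characteristic e.

    Mirrors the binary addition in the text:
        (0 e7 e6 e5 e4 e3 e2 e1) + (0 1 0 0 0 0 0 e0)
    """
    e0 = e & 1                        # set where the biased exponent is odd
    return (e >> 1) + 0b01000000 + e0


# A few sample characteristics (bias 128, so e = 128 means exponent 0).
for e in (1, 128, 129, 130, 255):
    print(e, sqrt_characteristic(e))
```

For every nonzero e the result equals 128 plus half of the (evened-up) 
exponent, which is the property the text relies on. 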

After shifting where e0 = 1, the X-fraction is in the range of 0.25 to 1 so 
the Q-fraction is in the range of 0.5 to 1 and is automatically normalized. 
The fraction bits of Q are computed in the order q2, q3, q4, ..., q24. 

Bit q2 is 1 if and only if the Q-fraction value is 0.75 or more; that is, if 
and only if the X-fraction value is 0.5625 or more (0.5625 = 0.75 * 0.75). 
The initial part of the routine computes q2 by setting the shift register 
lengths to 26, loading the X-fraction into the shift register, adding 0.5 for 
the hidden bit, shifting right one place where e0 = 1, and subtracting 0.0625. 
The result will be 0.5 or more where and only where q2 = 1 so the result bit 
with weight 0.5 is stored as q2. Where e0 = 0, the binary arithmetic looks 
like: 



        0   x2  x3  x4  x5  ...  x23  x24  0 
     +  0   1   1   1   0   ...  0    0    0 
        ------------------------------------- 
        r1  r2  r3  r4  r5  ...  r23  r24  r25 

and where e0 = 1, the binary arithmetic looks like: 

        0   0   x2  x3  x4  ...  x22  x23  x24 
     +  0   0   1   1   0   ...  0    0    0 
        --------------------------------------- 
        r1  r2  r3  r4  r5  ...  r23  r24  r25 

Bit r1 of the result is stored as q2 and the other result bits are used in the 
computation of the other Q fraction bits. 

To compute bits q3 through q24 the routine enters the main loop (starting at 
MAIN2). For j = 2, 3, ... assume that bits q2, q3, ..., q(j-1) have been 
determined and bit qj is now being computed. Let Q(j) equal the value of the 
Q-fraction if qj is replaced by 1 and qk is replaced by 0 for all k > j. 
Thus: 

   Q(2)  = 1/2 + 1/4 
   Q(3)  = 1/2 + q2/4 + 1/8 
   Q(4)  = 1/2 + q2/4 + q3/8 + 1/16 
   ... 
   Q(24) = 1/2 + q2/4 + q3/8 + ... + q23/(2**23) + 1/(2**24) 

Bit qj = 1 if and only if the value of the X-fraction is Q(j)*Q(j) or more. 
For j = 2, 3, ... we define: 

   R(j) = 1/2 + (2**(j-2)) * (X - Q(j)*Q(j)) 

With this definition, bit qj = 1 if and only if R(j) is 1/2 or more. Note 
that when the routine enters the main loop at MAIN2, the value in the shift 
register equals R(2). Iteration j-1 of the main loop first stores a bit of 
R(j) into qj (it stores the bit with weight 1/2). Then it calculates R(j+1) 
from R(j). From the definition of R(j) we have: 

   R(j+1)   = 1/2 + (2**(j-1)) * (X - Q(j+1)*Q(j+1)) 
   2 * R(j) = 1   + (2**(j-1)) * (X - Q(j)*Q(j)) 

So: 

   R(j+1) = 2*R(j) - 1/2 + (2**(j-1))*(Q(j)*Q(j) - Q(j+1)*Q(j+1)) 
   R(j+1) = 2*R(j) - 1/2 + (2**(j-1))*(Q(j) - Q(j+1))*(Q(j) + Q(j+1)) 

But Q(j) - Q(j+1) = (1 - 2*qj)/(2**(j+1)), Q(j) + Q(j+1) = 2*Q(j) + (2*qj - 
1)/(2**(j+1)), and (2*qj - 1)*(1 - 2*qj) = -1 so: 

   R(j+1) = 2*R(j) - 1/2 + (1 - 2*qj)*Q(j)/2 - 1/(2**(j+3)) 

Where qj = 0, we obtain the following binary addition for R(j+1): 

   2 * R(j):   r2  r3  r4  r5  ...  rj      r(j+1)  r(j+2)  r(j+3)  r(j+4) 
             + 1   1   q2  q3  ...  q(j-2)  q(j-1)  0       1       1 
               ------------------------------------------------------------ 
   R(j+1):     r1  r2  r3  r4  ...  r(j-1)  rj      r(j+1)  r(j+2)  r(j+3) 

and where qj = 1, we obtain the following binary addition for R(j+1), in 
which ~q denotes the complement of q: 

   2 * R(j):   r2  r3  r4   r5   ...  rj       r(j+1)   r(j+2)  r(j+3)  r(j+4) 
             + 0   0   ~q2  ~q3  ...  ~q(j-2)  ~q(j-1)  0       1       1 
               ---------------------------------------------------------------- 
   R(j+1):     r1  r2  r3   r4   ...  r(j-1)   rj       r(j+1)  r(j+2)  r(j+3) 
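
The recurrence for R(j+1) can be exercised with a small scalar sketch. It 
uses exact rational arithmetic in Python rather than bit planes, so it only 
illustrates the arithmetic, not the PECU instruction sequence. The function 
name sqrt_fraction_bits and its arguments are hypothetical. 

```python
from fractions import Fraction


def sqrt_fraction_bits(x, nbits=24):
    """Bit-serial square root of a fraction x with 1/4 <= x < 1.

    Returns (bits, q): bits holds q2..q(nbits) and q is the resulting
    Q-fraction, including the leading 1/2.
    """
    half = Fraction(1, 2)
    q = half                                  # leading bit of the Q-fraction
    r = half + x - Fraction(9, 16)            # R(2) = 1/2 + (X - Q(2)*Q(2))
    bits = []
    for j in range(2, nbits + 1):
        qj = 1 if r >= half else 0            # qj = 1 iff R(j) >= 1/2
        trial = q + Fraction(1, 2 ** j)       # Q(j): qj forced to 1
        bits.append(qj)
        # R(j+1) = 2*R(j) - 1/2 + (1 - 2*qj)*Q(j)/2 - 1/(2**(j+3))
        r = 2 * r - half + (1 - 2 * qj) * trial / 2 - Fraction(1, 2 ** (j + 3))
        q += Fraction(qj, 2 ** j)
    return bits, q


bits, q = sqrt_fraction_bits(Fraction(9, 16))
print(bits[:4], q)
```

The returned q is the largest 24-bit fraction whose square does not exceed x, 
which is the truncated square root before the rounding step described later. 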

The main loop will recirculate 2*R(j) through the A-register while 
constructing the addend in the P-register, and performing the addition to get 
R(j+1) in the B-register and the shift register. To ease the construction of 
the addend in the P-register, bits t2 through t23 are built up in the 22-plane 
temporary storage array, where t2 = 1 ⊕ q2 and ti = q(i-1) ⊕ qi for i in the 
range of 3 to 23 (⊕ denotes exclusive-or). When r1 of the previous addition 
is moved from the B-register to qj, the P-register contains the complement of 
q(j-1), so a simple logic operation will form tj in the P-register. Bit tj 
is stored in the following cycle and P is set to 1 (R(j) is also shifted one 
place and 0 added with SHIFT A and HALFADD operations). Then R(j) is 
re-circulated 23-j times with 0's added to bring bit r(j+4) into the 
A-register. Two cycles of SHIFT A and FULLADD add 1's to bits r(j+4) and 
r(j+3). The next cycle does a SHIFT A and HALFADD (to add 0 to bit r(j+2)) 
and loads tj = q(j-1) ⊕ qj into the P-register (this is the correct addend to 
bit r(j+1)). Then j-2 addition cycles are performed while P is 
exclusive-or'ed with successive t bits (this puts the correct addend in P 
whether qj = 0 or qj = 1). 

When the main loop is finished (END1), bit q24 is stored and another step is 
performed to see if the round-bit (q25) is 0 or 1. The shift register length 
is set to 30 and the round-bit is added to the q bits to obtain the final 
fraction in the shift register. Then the characteristic of the result is 
computed (described above) and stored in the shift register. 

If the X-characteristic equals 0, then X = 0 and the result equals 0. The 
G-register is cleared wherever X = 0. Then the shift register is stored in the 
result array where G = 1, and 0's are stored where G = 0. 


2.1.4 SINE and COSINE ARRAY SUBROUTINE 


DESCRIPTION : 

The MCL mnemonics using these functions are SINA, COSA, and SINCOSA. The MCL 
statement: 


SINA ang, sin, [temp] 

generates an array of sines (sin) from an array of angles (ang). The MCL 
statement: 


COSA ang, cos, [temp] 

generates an array of cosines (cos) from an array of angles (ang). The MCL 
statement: 


SINCOSA ang, sin, cos, [temp] 

generates an array of sines (sin) and an array of cosines (cos) from an array 
of angles (ang). Arrays ang, sin, and cos are in the 32-bit VAX-F 
floating-point format. Angles are in radians. The optional parameter, temp, 
is used to specify the location of a 90-plane array for temporary storage. If 
temp is not specified then planes 884 through 973 are used. 

Each mnemonic first calls a PECU routine (VFSC1$) to convert the angle array 
to a fixed-point format. VFSC1$ adjusts the angles to lie in the first 
quadrant (0 to 90 degrees), leaves them in the planar shift registers, and 
initializes three planes, Z, COSSGN, and SINSGN, in temporary array storage. 

Then each mnemonic calls a PECU routine, VFSC2$, twelve times to adjust the 
angles and leave them close to 45 degrees. 

Then VFSC3$ is called to generate the sine and cosine of the angles close to 
45 degrees and leave them in SNA and CS, respectively. 

Then VFSC4$ is called twelve times to adjust the sine and cosine arrays until 
they are fixed-point versions of the desired results. 

Finally, each mnemonic calls VFSC5$ to float the results and store them in the 
result array(s), sin and/or cos. SINCOSA makes two calls to float both the 
sine and cosine arrays. SINA and COSA each make only one call to float the 
desired result. 


TEMPORARY ARRAY STORAGE 

These functions use 90 planes of temporary storage (planes 884 through 973 as 
a default). The layout is shown below: 

   +---+--------+----+--------+-----+-----+ 
   | Z | COSSGN | CS | SINSGN | SNA | SNB | 
   +---+--------+----+--------+-----+-----+ 


The COSSGN plane contains the sign of the cosine while the SINSGN plane 
contains the sign of the sine. The CS array contains the fixed-point version 
of the cosine magnitude; the MSB of CS has a weight of unity and the LSB of 
CS has a weight of 2**(-28). The SNA array contains the fixed-point version 
of the sine magnitude with the same scaling as CS. The SNB array is an 
alternate copy of SNA. The Z-plane shows where the angle sense is reversed - 
the value of Z depends on the quadrant containing the angle as shown below: 


   +----------+-------------+---+ 
   | Quadrant | Degrees     | Z | 
   +----------+-------------+---+ 
   | First    |   0 to  90  | 0 | 
   | Second   |  90 to 180  | 1 | 
   | Third    | 180 to 270  | 0 | 
   | Fourth   | 270 to 360  | 1 | 
   +----------+-------------+---+ 


VFSC1$ ROUTINE 

The VFSC1$ PECU routine converts a 32-bit VAX-F array of angles to fixed-point 
and leaves the result in the planar shift registers. The routine also 
initializes Z, COSSGN, and SINSGN. 

The input angle array is in radians. If the angle magnitude is larger than a 
revolution (2*pi radians), the routine should subtract off any integral 
multiple of revolutions to leave an angle with a magnitude less than a 
revolution. The easiest way to accomplish this is to divide the angle by 2*pi 
and only treat the fractional part of the quotient. Another advantage of this 
scaling is that the left-most pair of bits of the fractional part show which 
quadrant contains the angle. 

The division by 2*pi is accomplished by multiplying the angle by 1/(2*pi). 
Let the angle be: 

   +/- (2**(E-128)) * F 

where E is the 8-bit characteristic of the angle and F is the 23-bit fraction 
of the angle (plus 0.5 for the hidden bit). Since 1/(2*pi) = 0.25 * 2/pi the 
product is: 

   +/- (2**(E-130)) * (F * 2/pi) 

We multiply the fraction, F, by 2/pi. Since F < 1, the product is less than 
2/pi = 0.6366 and the MSB of the product has a weight of 0.5 revolutions. 
Where E < 130 the product is shifted right 130-E places with zero bits 
inserted at the left end. Where E > 130 the product is shifted left E-130 
places while discarding all bits shifted off the left end. Where E = 130 the 
product is not shifted. 

The constant, 2/pi, equals 0.1010 0010 1111 1001 1000 0011 0111 in binary with 
a relative error of 1/1613825000. If we allow negative binary digits (4 
meaning -1), the constant can be written as 0.1010 0104 0000 4010 4000 0100 
4004. Thus, F * 2/pi can be obtained with 10 additions and subtractions of F 
shifted appropriately: 

F * 2/pi = F/2 + F/8 + F/64 - F/256 - F/8192 + F/32768 - F/131072 
           + F/(2**22) - F/(2**25) - F/(2**28) 
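
As a check on the signed-digit expansion above, the ten shifted terms can be 
summed exactly. This is an illustrative Python sketch only; the names TERMS 
and times_two_over_pi are hypothetical and do not appear in the MPP code. 

```python
from fractions import Fraction
import math

# (sign, right-shift) pairs read off the signed-digit form of 2/pi.
TERMS = [(+1, 1), (+1, 3), (+1, 6), (-1, 8), (-1, 13),
         (+1, 15), (-1, 17), (+1, 22), (-1, 25), (-1, 28)]


def times_two_over_pi(f):
    """Approximate f * 2/pi with ten shifted additions and subtractions."""
    return sum(sign * f / 2 ** shift for sign, shift in TERMS)


constant = times_two_over_pi(Fraction(1))
relative_error = abs(float(constant) - 2 / math.pi) / (2 / math.pi)
print(constant, relative_error)
```

The sum reproduces the 28-bit binary constant quoted in the text, and the 
relative error comes out on the order of 6 * 10**(-10), in line with the 
stated figure. 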

The product, F * 2/pi, is computed with a precision of 31 bits; the MSB has a 
weight of 0.5 revolution and the LSB has a weight of 2**(-31) revolutions. 
The product is left in register A, register B, and the 30-bit-long planar 
shift register. 

If E < 99 then the product should be right-shifted 32 or more places and the 
result will be zero. This will occur for any angle magnitude less than 
2**(-30) radians. 

If E > 161 then the product should be left-shifted 32 or more places. All 
fraction bits of the product will be shifted off the left end to leave a 
result of zero. This will occur for any angle magnitude greater than 2**33 
radians. Such large angles should never occur in any reasonable application. 
If such a large angle does occur, the weight of its LSB is at least 1024 
radians so all significance is lost and an angle of zero is just as good as 
any other angle. 

The angle characteristic, E, is used to compute a six-bit shift constant, S. 
Where S = 32 the product is not shifted. Where S > 32 the product is shifted 
left S-32 places. Where S < 32 the product is shifted right 32-S places. Let 
Ei be the bit of E with weight 2**i and let Si be the bit of S with weight 
2**i. The shift constant, S, is computed with the following binary addition: 

      E7  E6  E5  E4  E3  E2  E1  E0 
   +  0   E7  0   1   1   1   1   0 
      ------------------------------ 
      T7  T6  T5  T4  T3  T2  T1  T0 

For i = 0, 1, 2, 3, 4, and 5, bit Si of S equals the logical-and of T7 and Ti. 

Note that where E = 98 the addition will produce a sum of 128, bits T5 through 
T0 will all be zero, and S will be zero. Where E < 98 the addition will 
produce a sum less than 128, bit T7 will be zero and S will be zero. Where 
E > 161 the sum will overflow its limit of 255, bit T7 will be zero and S will 
be zero. In all these cases S = 0 so the product, F * 2/pi, is right-shifted 
32 places to produce a zero angle. 

Where 98 < E < 128, the addition will produce a sum of E+30, T7 will be 1 and 
S will equal E-98 so the product, F * 2/pi, is right-shifted 32-S = 130-E 
places. Where E = 128 or 129, the addition will produce a sum of E+94, T7 
will be 1, S will equal E-98, and the product, F * 2/pi, will be right-shifted 
32-S = 130-E places. Where 129 < E < 162, the addition will produce a sum of 
E+94, T7 will be 1, S will equal E-98, and the product, F * 2/pi, will be 
left-shifted S-32 = E-130 places. 
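
The case analysis above can be confirmed with a scalar sketch of the 8-bit 
addition. This is illustrative Python, not the PECU code, and the function 
name shift_constant is hypothetical. 

```python
def shift_constant(e: int) -> int:
    """Six-bit shift constant S for an 8-bit angle characteristic e."""
    e7 = (e >> 7) & 1
    addend = (e7 << 6) | 0b00011110      # bit pattern 0 E7 0 1 1 1 1 0
    t = (e + addend) & 0xFF              # 8-bit sum; the carry out is dropped
    t7 = (t >> 7) & 1
    return (t & 0x3F) if t7 else 0       # Si = T7 AND Ti for i = 0..5


# S = 32 means "no shift"; S = 0 means the angle is treated as zero.
for e in (97, 98, 99, 128, 130, 161, 162):
    print(e, shift_constant(e))
```

Sweeping e over 0 to 255 shows S = E-98 for 98 <= E <= 161 and S = 0 
elsewhere, as the text states. 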

First, VFSC1$ computes the shift constant S and stores it in six bits of the 
temporary arrays. Then the product, F * 2/pi, is computed. Then the product 
is right or left shifted depending on S. 

When the routine computes the product, F * 2/pi, it leaves the MSB of the 
product in the B-plane, the LSB at the end of the planar shift register, and 
clears the A-plane to clear the bit to the right of the LSB. The routine then 
shifts the products left and right depending on the shift constant, S, to 
leave the bit with weight pi (180 degrees) at the end of the planar shift 
register, the bit with 90-degree weight in the A-plane, and the bit with 
45-degree weight in the B-plane. This operation has three phases: 
pre-rotation, clearing, and post-rotation. Pre-rotation aligns the products 
so the subsequent clearing phase clears the correct bits. Where S > 32, the 
clearing phase clears the leftmost S-32 bits of the product and where S < 32, 
the clearing phase clears the rightmost 32-S bits of the product. 
Post-rotation performs the final alignment of the product. 

Where S < 32 (S5 = 0), the routine pre-rotates the products right 31 places 
(equivalent to a left rotation of one place), clears 32-S bits by shifting the 
products right while clearing the A-plane, and then post-rotates the products 
31 places (equivalent to another left rotation of one place). Where S5 = 1 
(S > 31), the routine pre-rotates the products right 63-S places (equivalent 
to a left rotation of S-31 places), clears S-32 bits by shifting the products 
right while clearing the A-plane, and then post-rotates the products right 
63-S places. 
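
The pre-rotation phase and the post-rotation phase are identical. Each 
rotation phase has five parts. In each part, the G-plane is loaded with a 
certain mask and then a number of SHIFTM A & HALFADDM instructions are 
performed to shift the products. The table below shows the mask and number of 
instructions for each part (~ denotes the logical complement, so that the 
masked shifts total 31 places where S5 = 0 and 63-S places where S5 = 1): 

   +-----------+---------------------------------------------+ 
   | Mask      | Number of SHIFTM A & HALFADDM instructions  | 
   +-----------+---------------------------------------------+ 
   | ~S5 v ~S4 | 16                                          | 
   | ~S5 v ~S3 |  8                                          | 
   | ~S5 v ~S2 |  4                                          | 
   | ~S5 v ~S1 |  2                                          | 
   | ~S5 v ~S0 |  1                                          | 
   +-----------+---------------------------------------------+ 
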

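The clearing phase has six parts. In each part a number of SHIFTM, HALFADDM, 
& CLEARAM instructions are performed with the G-plane loaded with a certain 
mask as shown in the table below (~ denotes the logical complement, so that 
32-S bits are cleared where S5 = 0 and S-32 bits are cleared where S5 = 1): 

   +------------+----------------------------------------------------+ 
   | Mask       | Number of SHIFTM, HALFADDM, & CLEARAM instructions | 
   +------------+----------------------------------------------------+ 
   | ~S5        |  1                                                 | 
   | ~(S5 ⊕ S4) | 16                                                 | 
   | ~(S5 ⊕ S3) |  8                                                 | 
   | ~(S5 ⊕ S2) |  4                                                 | 
   | ~(S5 ⊕ S1) |  2                                                 | 
   | ~(S5 ⊕ S0) |  1                                                 | 
   +------------+----------------------------------------------------+ 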





After the three phases of the alignment, VFSC1$ uses the sign of the angle and 
the two leftmost bits of the product (the bits with weights 180 degrees and 90 
degrees) to initialize Z, COSSGN, and SINSGN. The rest of the product is an 
angle in the first quadrant and is left in the planar shift registers. 

The Z-plane equals the product bit with weight 90 degrees. The COSSGN-plane 
is the logical exclusive-or of the product bits with weights 180 degrees and 
90 degrees. The SINSGN-plane is the logical exclusive-or of the product bit 
with weight 180 degrees and the sign bit of the angle. 

The VFSC1$ routine takes 469 machine cycles to execute. 


VFSC2$ ROUTINE 

The VFSC2$ routine is called twelve times by the MCU. The call index, N, 
ranges from 1 to 12. Let A(N) = arctan(2**(-N)); A(1) = arctan(0.5) = 26.565 
degrees; A(2) = arctan(0.25) = 14.036 degrees; etc. Each call to VFSC2$ 
checks the values of the angles in the planar shift registers. Where the 
value is less than 45 degrees call-N adds A(N) to the angle and where the 
value is 45 degrees or more call-N subtracts A(N) from the angle. 

Initially the angles in the planar shift registers range from 0 degrees to 90 
degrees. After call-1 the angles will range from 45 - 26.565 = 18.435 degrees 
to 45 + 26.565 = 71.565 degrees. After call-N the angles will range from 45 
degrees - A(N) to 45 degrees + A(N). After call-12 the angles will range from 
44.986 degrees to 45.014 degrees. 

Twelve bit-planes in SNB are used to store the sign of the angle adjustment of 
each call. Bit-plane Fn = 1 where A(N) was subtracted from the angle and Fn = 
0 where A(N) was added to the angle. 

When the MCU calls VFSC2$ it initializes bits 0 through 31 of the PECU common 
register depending on the value of A(n). The value in the common register is 
D(n) defined as follows. For i = 0, 1, ..., 29, let a(n,i) be the bit of A(n) 
with weight 45 * (2**(-i)) degrees and let d(n,i) be the bit of D(n) put into 
bit i of the common register. Then d(n,29) = a(n,29) and d(n,i) = a(n,i) ⊕ 
a(n,i+1) for i = 0, 1, ..., 28 (⊕ denotes exclusive-or). Bits 30 and 31 of 
the common register are always set to 0. The following table shows how the 
left half of the common register should be initialized for n = 1, 2, ..., 12: 

   +----+------------------------------------+ 
   | n  | Bits 0-31 of Common Register (hex) | 
   +----+------------------------------------+ 
   |  1 | DCB0 3C88                          | 
   |  2 | 6835 23B4                          | 
   |  3 | 3CCC C9F0                          | 
   |  4 | 1E74 5F14                          | 
   |  5 | 0F39 E08C                          | 
   |  6 | 079C 6888                          | 
   |  7 | 03CE 13FC                          | 
   |  8 | 01E7 0BD4                          | 
   |  9 | 00F3 85C4                          | 
   | 10 | 0079 C2A0                          | 
   | 11 | 003C E150                          | 
   | 12 | 001E 70A8                          | 
   +----+------------------------------------+ 
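
The construction of D(n) from A(n) can be sketched as follows. This is an 
illustrative Python fragment, not the MCU code: the function name 
common_register_word is hypothetical, and truncating A(n) to 30 bits (rather 
than rounding) is an assumption, so the lowest-order hex digits may differ 
from the table. 

```python
import math


def common_register_word(n: int) -> int:
    """32-bit word D(n); register bit 0 is the leftmost (most significant)."""
    # A(n) as a fraction of 45 degrees, so bit a(n,i) has weight 2**(-i).
    units = math.atan(2.0 ** -n) / (math.pi / 4)
    a = int(units * 2 ** 29)                  # 30-bit value (assumed truncation)
    # d(n,i) = a(n,i) xor a(n,i+1), with d(n,29) = a(n,29)
    d = (a ^ (a << 1)) & (2 ** 30 - 1)
    return d << 2                             # bits 30 and 31 stay zero


for n in (1, 2, 3):
    print(n, f"{common_register_word(n):08X}")
```

The leading hex digits produced this way agree with the first rows of the 
table above. 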


Each call to VFSC2$ takes 33 machine cycles. Twelve calls require 396 cycles. 


VFSC3$ ROUTINE 

VFSC3$ computes the sine and cosine of the angles left in the planar shift 
registers. These angles range from 44.986 degrees to 45.014 degrees. Let 
such an angle be X + 45 degrees where -0.014 degrees < X < 0.014 degrees. 
Then: 

   sin(X + 45 deg) = sin(X) * cos(45 deg) + cos(X) * sin(45 deg) 
   cos(X + 45 deg) = cos(X) * cos(45 deg) - sin(X) * sin(45 deg) 

But sin(45 deg) = cos(45 deg) = sqrt(0.5) so: 

   sin(X + 45 deg) = sqrt(0.5) * (cos(X) + sin(X)) 
   cos(X + 45 deg) = sqrt(0.5) * (cos(X) - sin(X)) 

Since X is so close to zero we can approximate sin(X) with X (in radians) and 
cos(X) with unity to obtain: 

   sin(X + 45 deg) = sqrt(0.5) * (1 + X) 
   cos(X + 45 deg) = sqrt(0.5) * (1 - X) 

The error magnitude in these approximations is less than 2.11 * 10**(-8). 

When VFSC3$ is called the right-half of the common register should be loaded 
with 09B7 4EDC in hex. VFSC3$ takes 321 machine cycles to execute. 


VFSC4$ ROUTINE 

The VFSC4$ routine is called twelve times. Call N causes a rotation of +/- 
A(N). After twelve calls SNA and CS contain fixed-point versions of desired 
results. The well-known trigonometric identities: 

   sin(Y + Z) = sin(Y) * cos(Z) + cos(Y) * sin(Z) 
   cos(Y + Z) = cos(Y) * cos(Z) - sin(Y) * sin(Z) 

can be rewritten to obtain: 

   sin(Y + Z) = cos(Z) * (sin(Y) + cos(Y) * tan(Z)) 
   cos(Y + Z) = cos(Z) * (cos(Y) - sin(Y) * tan(Z)) 

Let Z = +/- A(N) so tan(Z) = +/- 2**(-N). Then: 

   sin(Y + A(N)) = cos(A(N)) * (sin(Y) + cos(Y)/(2**N)) 
   sin(Y - A(N)) = cos(A(N)) * (sin(Y) - cos(Y)/(2**N)) 
   cos(Y + A(N)) = cos(A(N)) * (cos(Y) - sin(Y)/(2**N)) 
   cos(Y - A(N)) = cos(A(N)) * (cos(Y) + sin(Y)/(2**N)) 

If the cos(A(N)) factor in these equations is ignored for the moment then 
simple shifts and adds or subtracts generate the sine and cosine of Y +/- A(N) 
from the sine and cosine of Y. Twelve steps with N = 1, 2, ..., 12, 
respectively will generate the sine and cosine of the initial first quadrant 
angle from sin(X + 45 deg) and cos(X + 45 deg). Let K = cos(A(1)) * cos(A(2)) 
* ... * cos(A(12)). Rather than multiply by cos(A(N)) in each of the twelve 
steps VFSC3$ multiplies sin(X + 45 degrees) and cos(X + 45 degrees) by K so 
after the twelfth call to VFSC4$ the results are correct. 

Odd-numbered calls to VFSC4$ (N = 1, 3, 5, ..., 11) use SNA as a source of the 
sine and put the new sine value in SNB. Even-numbered calls (N = 2, 4, 6, 
..., 12) use SNB as a source of the sine and put the new sine value in SNA. 
All calls to VFSC4$ use CS as a source for the cosine and put the new cosine 
value back into CS. 

Call N to VFSC4$ takes 181 - 2*N machine cycles to execute. The twelve calls 
to VFSC4$ require a total of 2016 cycles. 


VFSC5$ ROUTINE 

VFSC5$ floats a fixed-point value in SNA or CS and puts the result in the 
user's sine or cosine array. The SINCOSA mnemonic calls VFSC5$ twice to float 
both results. SINA and COSA call VFSC5$ once to float only the desired 
result. Each call to VFSC5$ takes 133 machine cycles. 
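
The VFSC2$, VFSC3$, and VFSC4$ steps can be sketched together for a single 
first-quadrant angle. The sketch below uses Python floating point instead of 
the fixed-point bit planes, omits the quadrant handling of VFSC1$ and the 
floating step of VFSC5$, and uses a hypothetical function name 
(sincos_first_quadrant); it is meant only to show that the three stages fit 
together as described. 

```python
import math


def sincos_first_quadrant(theta: float, steps: int = 12):
    """Sine and cosine of theta (radians, 0 <= theta <= pi/2)."""
    angles = [math.atan(2.0 ** -n) for n in range(1, steps + 1)]   # A(1)..A(12)

    # VFSC2$-like stage: drive the angle toward 45 degrees, recording flags.
    flags = []
    ang = theta
    for a in angles:
        if ang < math.pi / 4:
            ang += a
            flags.append(0)          # A(N) was added
        else:
            ang -= a
            flags.append(1)          # A(N) was subtracted

    # VFSC3$-like stage: small-angle estimate near 45 degrees, pre-scaled by K.
    x = ang - math.pi / 4
    k = math.prod(math.cos(a) for a in angles)
    s = k * math.sqrt(0.5) * (1 + x)
    c = k * math.sqrt(0.5) * (1 - x)

    # VFSC4$-like stage: undo each adjustment with a shift-and-add rotation.
    for n, f in enumerate(flags, start=1):
        d = 2.0 ** -n
        if f:                        # A(N) was subtracted, so rotate by +A(N)
            s, c = s + c * d, c - s * d
        else:                        # A(N) was added, so rotate by -A(N)
            s, c = s - c * d, c + s * d
    return s, c


print(sincos_first_quadrant(0.3), (math.sin(0.3), math.cos(0.3)))
```

With twelve steps the results agree with the library sine and cosine to 
within a few parts in 10**8, consistent with the approximation error quoted 
for VFSC3$. 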


TIMING 

The SINA and COSA mnemonics each require 3335 machine cycles to execute. Thus 
sines or cosines can be computed at a rate higher than 49 MOPS. 

The SINCOSA mnemonic requires 3468 machine cycles to execute. Thus if one 
wants both the sines and the cosines the rate is higher than 94 MOPS. 
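
The cycle totals follow directly from the per-routine counts given above. The 
short Python sketch below adds them up. The array size of 16384 processing 
elements and the 100-nanosecond machine cycle used to convert cycles into 
MOPS are assumptions taken from the general MPP design and are not stated in 
this section. 

```python
# Per-routine cycle counts quoted in the text.
cycles = {
    "VFSC1$": 469,
    "VFSC2$": 12 * 33,
    "VFSC3$": 321,
    "VFSC4$": sum(181 - 2 * n for n in range(1, 13)),
    "VFSC5$": 133,
}

sina_cycles = sum(cycles.values())                 # SINA or COSA
sincosa_cycles = sina_cycles + cycles["VFSC5$"]    # one extra float step

ELEMENTS = 16384          # assumed: 128 x 128 processing elements
CYCLE_SECONDS = 100e-9    # assumed: 10 MHz machine cycle

sina_mops = ELEMENTS / (sina_cycles * CYCLE_SECONDS) / 1e6
sincosa_mops = 2 * ELEMENTS / (sincosa_cycles * CYCLE_SECONDS) / 1e6
print(sina_cycles, sincosa_cycles, sina_mops, sincosa_mops)
```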


2.1.5 ARCTANGENT ARRAY SUBROUTINE 


DESCRIPTION : 

The MCL mnemonic ARCTNA computes an array of arctangents from an array of 
slopes. The form of the mnemonic is: 

   ARCTNA arctan, slope[,temp] 

where arctan and slope are arrays of 32-bit VAX-F format floating-point 
numbers. No restriction is placed on the values in the slope array. The 
values computed in arctan will be in radians and range from -pi/2 to pi/2. 
The optional parameter, temp, is an array of 82 bit planes used for temporary 
storage - if temp is omitted then the routine uses bit planes 892 through 973. 


METHOD 

Since the sign of y = arctan(x) is the same as the sign of x we ignore the 
sign until the very end. Thus, we assume that both x and y are non-negative. 

Let A(i) = arctan(2**(-i)) for i = 0, 1, 2, ... and let z be any angle. Then: 

                      tan(z) - tan(A(i)) 
   tan(z - A(i)) = ------------------------ 
                    1 + (tan(z)*tan(A(i))) 

Let tan(z) = N(i)/D(i) for some numbers N(i) and D(i) and let tan(z-A(i)) = 
N(i+1)/D(i+1). Since tan(A(i)) = 2**(-i) we have: 

   N(i+1)     (N(i)/D(i)) - (2**(-i)) 
   ------ = --------------------------- 
   D(i+1)    1 + (N(i)/(D(i)*(2**i))) 

When we multiply the numerator and the denominator of the right side by D(i) 
we obtain: 

   N(i+1)   N(i) - D(i)/(2**i) 
   ------ = ------------------ 
   D(i+1)   D(i) + N(i)/(2**i) 

We equate the two numerators and equate the two denominators to obtain: 

   N(i+1) = N(i) - D(i)/(2**i) 
   D(i+1) = D(i) + N(i)/(2**i) 

One method to compute y = arctan(x) is to start with two numbers N(0) and D(0) 
such that N(0)/D(0) = x and initialize a third number, Y(0), to zero. Then we 
perform the following iteration step for i = 0, 1, 2, ..., n: 

   WHERE ((2**i) * N(i) >= D(i)) 
      D(i+1) = D(i) + N(i)/(2**i) 
      N(i+1) = N(i) - D(i)/(2**i) 
      Y(i+1) = Y(i) + A(i) 
   ELSEWHERE 
      D(i+1) = D(i) 
      N(i+1) = N(i) 
      Y(i+1) = Y(i) 
   ENDWHERE 

For i = 0, 1, 2, ... define the angle Z(i) such that tan(Z(i)) = N(i)/D(i). 
Then Z(0) = y and Z(0) + Y(0) = y. The first iteration, with i = 0, selects 
those places where the slope, N(0)/D(0), is unity or higher (that is, where 
Z(0) is A(0) or higher) and subtracts A(0) from Z(0) to obtain Z(1). In the 
same places it adds A(0) to Y(0) to obtain Y(1). In all places Y(1) + Z(1) = 
y and Z(1) is less than A(0). Similarly, iteration i+1 selects those places 
where the slope, N(i)/D(i), is (2**(-i)) or higher (that is, where Z(i) is 
A(i) or higher) and subtracts A(i) from Z(i) to obtain Z(i+1). In the same 
places the iteration adds A(i) to Y(i) to obtain Y(i+1). In all places Y(i+1) 
+ Z(i+1) = y and Z(i+1) is less than A(i). Thus, the final iteration, with i 
= n, leaves Z(n+1) less than A(n) so Y(n+1) is within A(n) of the desired 
result, y. Since A(n) is less than (2**(-n)) for any n we have computed y to 
an accuracy of (2**(-n)). 
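
The WHERE/ELSEWHERE iteration above can be tried on a single value. The 
sketch below is plain Python floating point with a hypothetical function name 
(arctan_shift_add); it starts from N(0) = x and D(0) = 1, which is one valid 
choice satisfying N(0)/D(0) = x but not the initialization the routine 
actually uses. 

```python
import math


def arctan_shift_add(x: float, n: int = 25) -> float:
    """Approximate arctan(x) for x >= 0 with iterations i = 0..n."""
    num, den, y = x, 1.0, 0.0          # N(0), D(0), Y(0)
    for i in range(n + 1):
        scale = 2.0 ** -i
        if num >= den * scale:         # same test as (2**i)*N(i) >= D(i)
            # Both updates use the old N(i) and D(i).
            num, den = num - den * scale, den + num * scale
            y += math.atan(scale)      # add A(i)
    return y


for x in (0.1, 1.0, 5.0, 1000.0):
    print(x, arctan_shift_add(x), math.atan(x))
```

The result never exceeds arctan(x) and falls short of it by less than A(n), 
matching the accuracy argument in the text. 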

Let W(i) = (N(i)*N(i)) + (D(i)*D(i)) for i = 0, 1, 2, ..., n+1. Then in the 
places selected by iteration i+1, W(i+1) = W(i) * (1 + (4**(-i))). In the 
remaining places W(i+1) = W(i) so in all places, W(i+1) <= W(i) * (1 + 
(4**(-i))). Thus: 

   W(i+1) <= W(0) * 2 * 1.25 * 1.0625 * ... * (1 + (4**(-i))) 

   W(i+1) < W(0) * 2.71182 

Since D(i+1)*D(i+1) <= W(i+1) we obtain: 

   D(i+1) < 1.6468 * sqrt(W(0)) 

Thus, D(i+1) can be bounded if W(0) can be bounded. Also we know that 
N(i+1)/D(i+1) < (2**(-i)) so N(i+1) can also be bounded. 


The test for the where condition in the above method is not convenient so we 
modify the method as follows. For i = 0, 1, 2, ... let T(i) = ((2**i)*N(i)) - 
D(i) so N(i) = (T(i)+D(i))/(2**i). Then T(0) = N(0) - D(0) and the iteration 
step of the above method can be replaced by the following step which involves 
T instead of N: 

   WHERE (T(i) >= 0) 
      D(i+1) = D(i) + (T(i) + D(i))/(4**i) 
      T(i+1) = 2*T(i) - D(i+1) 
      Y(i+1) = Y(i) + A(i) 
   ELSEWHERE 
      D(i+1) = D(i) 
      T(i+1) = 2*T(i) + D(i+1) 
      Y(i+1) = Y(i) 
   ENDWHERE 

Note that as i gets large the adjustment, D(i+1) - D(i), approaches zero and 
this iteration step approaches the iteration step for the non-restoring 
division algorithm. Since N(i+1)/D(i+1) < (2**(-i)) we have -D(i+1) < T(i+1) 
< D(i+1) so T(i+1) is well-bounded. 
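
The T-based step can be sketched the same way. This is illustrative Python 
floating point with a hypothetical function name (arctan_t_iteration), 
starting from D(0) = 1 and T(0) = x - 1 rather than from the fixed-point 
initialization described later. 

```python
import math


def arctan_t_iteration(x: float, n: int = 25) -> float:
    """Approximate arctan(x) for x >= 0 using T(i) = (2**i)*N(i) - D(i)."""
    d = 1.0                 # D(0); with N(0) = x this gives N(0)/D(0) = x
    t = x - d               # T(0) = N(0) - D(0)
    y = 0.0
    for i in range(n + 1):
        if t >= 0:
            d = d + (t + d) / 4 ** i      # D(i+1)
            t = 2 * t - d                 # T(i+1) uses the new D(i+1)
            y += math.atan(2.0 ** -i)     # add A(i)
        else:
            t = 2 * t + d                 # D is unchanged in these places
    return y


for x in (0.1, 1.0, 5.0, 1000.0):
    print(x, arctan_t_iteration(x), math.atan(x))
```

Only the sign of T is tested in each step, which is the convenience the text 
is after. 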


The input operand, x, is a floating-point number in VAX-F format. Where x is 
non-zero we have: 

   x = (F + 0.5) * (2**(E-128)) 

where E is the characteristic in the range of 1 to 255 and F is the 23-bit 
fraction in the range of 0 to 0.5-(2**(-24)). Where x = 0 we have E = F = 0. 

Where E >= 129 we have x >= 1 so y = arctan(x) >= A(0). In these places we 
let N(0) = F + 0.5 and initialize D(0) to 2**(128-E) so N(0)/D(0) = x. T(0) = 
N(0) - D(0) is initialized to F + 0.5 - 2**(128-E), a non-negative number, so 
these places are selected by the first iteration step where i = 0. Where E >= 
153, x is (2**24) or higher so y is pi/2 - (2**(-24)) or more. When y is 
close to pi/2 the LSB of the y fraction has a weight of (2**(-23)) so the best 
value for y is simply pi/2. Thus, where E >= 153 we initialize D(0) to 0 and 
T(0) to F + 0.5 so N(0)/D(0) is infinite and we generate y = pi/2. 

Where E <= 128, x < 1 so the i = 0 iteration will not select this place. In 
fact, the first iteration that selects this place is when i = 129-E. Rather 
than waste the i = 0, 1, 2, ..., 128-E iterations we will start this place at 
i = 129-E. This means that there is an array of iteration numbers in 
temporary storage so each place can hold its own iteration number. The 
iteration number, I, is initialized to 129-E where E <= 128 and initialized to 
0 where E >= 129. Where 1 <= E <= 128 we initialize D(129-E) to 0.5 and 
T(129-E) to F so N(129-E) = (T(129-E) + D(129-E)) / (2**(129-E)) = x/2 and 
N(129-E)/D(129-E) = x. 

Where E = 0 we have x = 0. The method assumes the hidden bit equals 1 making 
x = 2**(-129). It computes y = arctan(x) = 2**(-129) and when the hidden bit 
is dropped from y it stores zeroes so y = arctan(x) = 0. 

Thus, the method is initialized with the following rules: 

   I    = 129-E              where E <= 128 
   I    = 0                  where E >= 129 

   D(I) = 0.5                where E <= 128 
   D(I) = 2**(128-E)         where 129 <= E <= 152 
   D(I) = 0                  where E >= 153 

   T(I) = F + 0.5 - D(I) 
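
The initialization rules can be written out for one element as below. This is 
an illustrative Python sketch using exact fractions; the function name 
arctan_init is hypothetical, and the arguments e and f stand for the 
characteristic E and the stored fraction F (without the hidden bit). 

```python
from fractions import Fraction


def arctan_init(e: int, f: Fraction):
    """Return (I, D(I), T(I)) for characteristic e and fraction 0 <= f < 1/2."""
    if e <= 128:
        i, d = 129 - e, Fraction(1, 2)
    elif e <= 152:
        i, d = 0, Fraction(1, 2 ** (e - 128))
    else:
        i, d = 0, Fraction(0)            # slope treated as infinite
    t = f + Fraction(1, 2) - d
    return i, d, t


# Example: e = 130, f = 1/4 encodes x = (1/4 + 1/2) * 2**2 = 3.
i, d, t = arctan_init(130, Fraction(1, 4))
n = (t + d) / 2 ** i                     # N(I) recovered from T(I) and D(I)
print(i, d, t, n / d)
```

Recovering N(I) = (T(I) + D(I))/(2**I) and dividing by D(I) gives back x 
wherever D(I) is non-zero. 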


Where E >= 129 we have N(0) = F + 0.5 < 1 and D(0) = 2**(128-E) <= 0.5 so 
W(0) < 1.25. Where E <= 128 we have N(I) = x/2 < 0.5 and D(I) = 0.5 so W(I) < 
0.5. Thus, D can be bounded by 1.6468 * sqrt(W(0)) < 1.8412. D is always 
non-negative so it can be held in an array whose leftmost bit has a weight of 
unity. Since -D < T < D, T is a signed quantity whose sign bit has a weight 
of -2. 

We compute y with an accuracy equal to the weight of the LSB of its fraction. 
We will perform 26 iterations with i = I, I+1, I+2, ..., I+25. This will 
leave 0 <= Z(I+26) < A(I+25). If we add A(I+26) to Y(I+26) the maximum error 
due to stopping the iteration at i = I+25 will be no more than A(I+26). 

The LSB of T and D will have a weight of 2**(-26) so D has 27 places and T has 
28 places. This means that the D(i+1) = D(i) + (T(i)+D(i))/(4**i) computation 
in each iteration step may have an error between -(2**(-27)) and 2**(-27) when 
I > 0. The minimum value for D(i+1) is 0.5 so the magnitude of the relative 
error is no more than 2**(-26). The relative error in tan(Z(i+1)) is no more 
than 2**(-26) so the maximum error in Z(i+1) is less than (2**(-26))*Z(i+1) < 
(2**(-26))*A(i). This error only occurs where T(i) >= 0, that is, only where 
A(i) is added to the final result, y. Thus, the contribution of this error to 
the final error is bounded by (2**(-26))*y. 

Another error source is in the summation of the various A(n) terms to form y. 
This error is minimized by delaying the calculation of y until all iterations 
have been performed - the Y(n+1) calculation in iteration n is replaced by: 

   F(n-I) = 1 where T(n) >= 0 
   F(n-I) = 0 where T(n) < 0 

The flag bit F(0) of the first iteration is always 1 and need not be stored. 
Flag bits F(1), F(2), ..., F(25) are stored in the temporary array until the 
end of the routine. As discussed above we always add A(I+26) to y so we 
assume F(26) = 1 as well. Thus, 

   y = F(0)*A(I) + F(1)*A(I+1) + ... + F(26)*A(I+26) 

The Taylor series expansion for A(N) is: 

   A(N) = 2**(-N) - (2**(-3*N))/3 + (2**(-5*N))/5 - (2**(-7*N))/7 + ... 

Let B(N,J) = ((-1)**J) * (2**(-N*(2J+1))) / (2J+1) so A(N) = B(N,0) + B(N,1) + 
B(N,2) + B(N,3) + ... Then: 

   y =   F(0)*B(I,0)     + F(0)*B(I,1)     + F(0)*B(I,2)     + ... 
       + F(1)*B(I+1,0)   + F(1)*B(I+1,1)   + F(1)*B(I+1,2)   + ... 
       ... 
       + F(26)*B(I+26,0) + F(26)*B(I+26,1) + F(26)*B(I+26,2) + ... 

We can change the order of summation to obtain: 

   y = C(I,0) + C(I,1) + C(I,2) + C(I,3) + C(I,4) + C(I,5) + ... 

where C(I,J) = F(0)*B(I,J) + F(1)*B(I+1,J) + ... + F(26)*B(I+26,J) 

We want to compute y as a floating-point number. Where I = 0 we have pi/4 <= 
y <= pi/2 and where I > 0 we have A(I) <= y < A(I-1). So pi/4 <= (2**I)*y < 2 
everywhere. Thus, we first compute (2**I)*y which needs only one 
normalization step to put it into the range of floating-point fractions (0.5 
to 1) and then compute the exponent of y. 

Let D(I) = C(I,5) + C(I,6) + C(I,7) + ... so (2**I)*y = (2**I)*(C(I,0) + 
C(I,1) + C(I,2) + C(I,3) + C(I,4) + D(I)). We compute (2**I)*y as follows: 

   (1) - Compute (2**(9*I)) * C(I,4) * (4095/4096) and shift it right 2*I 
         places. 

   (2) - Add (2**(7*I)) * C(I,3) * (4095/4096) to the result of (1) and 
         shift the sum right 2*I places. 

   (3) - Add (2**(5*I)) * C(I,2) * (4095/4096) to the result of (2) and 
         shift the sum right 2*I places. 

   (4) - Add (2**(3*I)) * C(I,1) * (4095/4096) to the result of (3) and 
         shift the sum right 2*I places. 

   (5) - Multiply the result of (4) by 4096/4095. 

   (6) - Add (2**I) * D(I) to the result of (5). 

   (7) - Add (2**I) * C(I,0) to the result of (6) to obtain (2**I) * y. 

Let F(0) = 1, F(1) = a, F(2) = b, and F(3) = c. Then (2**(9*I)) * C(I,4) *
(4095/4096) equals the following binary fraction:

0.000 111 000 111 aaa 000 aaa bbb 000 bbb ...

which can be formed in the P-register (LSB first) and entered into the shift
register in step (1).

Similarly, -(2**(7*I)) * C(I,3) * (4095/4096) equals the following binary
fraction:

0.0010010 01a01a0 0ab0ab0 0bc0bc0 ...

which can be formed in the P-register and subtracted from the shift register
in step (2).

Similarly, (2**(5*I)) * C(I,2) * (4095/4096) is the sum of the following two
binary fractions:

0.00110 01100 11bb0 0bb00 bbdd0 0dd00 ...

0.00000 00aa0 0aa00 aacc0 0cc00 ccee0 ...

where F(4) = d and F(5) = e. Each of these fractions is formed in the
P-register (LSB first) and added to the shift register in step (3).

Similarly, -(2**(3*I)) * C(I,1) * (4095/4096) is the sum of the following two
binary fractions:

0.010 1a1 aba bcb cdc ded efe fgf ghg hih ...

0.000 000 010 1a1 aba bcb cdc ded efe fgf ...

where F(6) = f, F(7) = g, F(8) = h, and F(9) = i. Each of these fractions is
formed in the P-register (LSB first) and subtracted from the shift register in
step (4).
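
The leading groups of each binary fraction follow from the fact that 4095 is divisible by 3, 5, 7, and 9, so the factor 4095/4096 turns each 1/(2J+1) into an exact 12-bit pattern. A short illustrative check (the helper name is hypothetical):

```python
def leading_bits(divisor):
    """12-bit binary expansion of (4095/divisor)/4096."""
    assert 4095 % divisor == 0
    return format(4095 // divisor, "012b")

patterns = {d: leading_bits(d) for d in (3, 5, 7, 9)}
```

The F(1), F(2), ... contributions repeat the same pattern shifted right by 2J+1 places per flag, which is why the letters a, b, c, ... interleave with the leading ones.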

In step (5) we multiply the shift register contents by 4096/4095. But:

4096/4095 = (4097/4096) * (((2**24)+1)/(2**24)) * ((2**48)/((2**48)-1))

With a relative error of (2**(-48)) we perform the multiplication by adding
the shift register to itself shifted right 24 places and then adding the shift
register to itself shifted right 12 places.
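
The two shift-and-add operations can be checked with exact fractions. This is an illustrative sketch only; the function name is hypothetical and it ignores the finite length of the shift register.

```python
from fractions import Fraction

def times_4096_over_4095(x):
    """Approximate x * 4096/4095 with two shift-and-add steps."""
    x = x + x / 2 ** 24    # add the register to itself shifted right 24 places
    x = x + x / 2 ** 12    # then to itself shifted right 12 places
    return x

exact = Fraction(4096, 4095)
approx = times_4096_over_4095(Fraction(1))
relative_error = (exact - approx) / exact
```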

The magnitude of D(I) is less than (2**(-11*I))/11 so for I >= 3 we have
(2**I)*D(I) < (2**(-30))/11 which can be neglected. Thus, D(I) only has
significance for I = 0, 1, and 2. Letting a = F(1) and b = F(2) we can find
the following binary fractions:

  -D(0) = 0.0000 1100 1010 11a1 1aa0 a00a baba ...

 -2D(1) = 0.0000 0000 0000 0100 1100 111a 0a0a ...

 -4D(2) = 0.0000 0000 0000 0000 0000 0001 0110 ...

In step (6) we select the places where I = 2 and add 4D(2) to the result of
(5), then we select the places where I = 1 and add 2D(1) to the result of (5),
and then we select the places where I = 0 and add D(0) to the result of (5).

Since (2**I) * C(I,0) is simply 1.abcd efgh ..., step (7) is the addition of
the appropriate F(n) values to the corresponding bits of the result of (6) to
form (2**I) * y.

The magnitude of the error in this computation of (2**I)*y is less than 13.37
* (2**(-30)). If (2**I)*y is less than unity then the A(I+26) error is at
most 16 * (2**(-30)) and the D(I+1) error contribution is at most 16 *
(2**(-30)) so the maximum error is 45.37 * (2**(-30)). The round-off error in
rounding the answer to the VAX-F format is at most 32 * (2**(-30)) giving a
worst case error of 1.21 times the weight of the LSB of the final fraction.
If (2**I)*y is unity or greater the weight of the LSB is doubled so the effect
of the worst error is smaller.
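
The 1.21 figure can be reproduced from the individual contributions. The sketch assumes the LSB of the final fraction has weight 2**(-24), i.e. a 24-bit VAX-F fraction counting the hidden bit; that weight is an assumption, not stated in the text above.

```python
UNIT = 2.0 ** -30

computation = 13.37 * UNIT     # error in computing (2**I)*y
a_term = 16 * UNIT             # A(I+26) error
d_term = 16 * UNIT             # D(I+1) error contribution
rounding = 32 * UNIT           # rounding to VAX-F

worst_case = computation + a_term + d_term + rounding
lsb_weight = 2.0 ** -24        # assumed weight of the fraction LSB
ratio = worst_case / lsb_weight
```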


TEMPORARY STORAGE 


The 82-plane temporary storage region has the following layout:

|<----- 14 ----->|<----------- 27 ----------->|<----- 14 ----->|<----- 27 ----->|
|                |             T              | S14  ...   S1  |       D        |
| F1   ...   F14 | F15 ... F25 |
|<------------ 25 ------------>|



The first 25 planes store F(1) through F(25), respectively. Planes F(15)
through F(25) are overlaid by the first 11 bits of the T array - array T is
not used to compute F(15) through F(25). Array T is only 27 bits long since
there is no need to store its MSB - the complement of its MSB equals F(1),
F(2), etc. on successive iterations.

The next fourteen planes (following the T array) store S14, S13, S12, ..., S1,
respectively, where Sn = 1 only where I >= n. Thus, S1 flags where I > 0 and
S14 flags where I > 13.

The last 27 planes store the D array.


VFATN$ ROUTINE

The VFATN$ routine computes the arctangents of elements of the x array and
places the results in corresponding elements of the y array. Both x and y are
arrays of 32-bit floating-point numbers in VAX-F format. The routine is
called with the following setup:

     Common Register = FE6A C3DD 6CCD 994E (in hex)

     R3 = LSB of y array

     R4 = MSB of temporary storage

     R5 = R4 + 40 = R6 - 41

     R6 = LSB of temporary storage

     R7 = LSB of x array

If the default temporary array is being used then R4 = 892, R5 = 932, and R6 =
973.

The basic parts of the VFATN$ routine are: construct S1 through S14; perform
iteration I; perform iterations I+1 through I+14; perform iterations I+15
through I+25; and construct y. These are described in the following sections.


Construct S1 through S14 - The following diagram shows the values in planes S1
through S14 as a function of the initial iteration number I:

   I    S1  S2  S3  S4  S5  S6  S7  S8  S9 S10 S11 S12 S13 S14

   0     0   0   0   0   0   0   0   0   0   0   0   0   0   0
   1     1   0   0   0   0   0   0   0   0   0   0   0   0   0
   2     1   1   0   0   0   0   0   0   0   0   0   0   0   0
   3     1   1   1   0   0   0   0   0   0   0   0   0   0   0
   4     1   1   1   1   0   0   0   0   0   0   0   0   0   0
   5     1   1   1   1   1   0   0   0   0   0   0   0   0   0
   6     1   1   1   1   1   1   0   0   0   0   0   0   0   0
   7     1   1   1   1   1   1   1   0   0   0   0   0   0   0
   8     1   1   1   1   1   1   1   1   0   0   0   0   0   0
   9     1   1   1   1   1   1   1   1   1   0   0   0   0   0
  10     1   1   1   1   1   1   1   1   1   1   0   0   0   0
  11     1   1   1   1   1   1   1   1   1   1   1   0   0   0
  12     1   1   1   1   1   1   1   1   1   1   1   1   0   0
  13     1   1   1   1   1   1   1   1   1   1   1   1   1   0
>=14     1   1   1   1   1   1   1   1   1   1   1   1   1   1



First the PE shift registers are set to a length of 14. Thirteen zero planes
are shifted into the shift registers as the routine computes S14 in P. Let E
be the characteristic of x and let E7, E6, ..., E0 be the bits of E where E7
is the MSB and E0 is the LSB. S14 = 1 where I >= 14. Since I = 129 - E, then
S14 = 1 where E <= 115. The logic equation for S14 (with ~ denoting the
complement) is:


     S14 = ~E7 (~E6 v ~E5 v ~E4 v ~E3 ~E2)

Then the complement of E0 is shifted into the shift registers. Then eight
ones are shifted into the shift registers where S14 v (E3 ⊕ (E2 v E1)) = 1
(where I >= 14 or where E mod 16 = 2, 3, ..., 8, 9). Then four ones are
shifted into the shift registers where S14 v (E2 ⊕ E1) = 1 (where I >= 14 or
where E mod 8 = 2, 3, 4, or 5). Then two ones are shifted into the shift
registers where S14 v E1 = 1 (where I >= 14 or where E mod 4 = 2 or 3). Where
E < 129 we shift the shift register into S13 through S1, respectively. Where
E >= 129 we shift zeros into S13 through S1. This completes the construction
of S1 through S14. The complement of S1 is left in the P register for use in
the next part. Construction of S1 through S14 requires 49 machine cycles.
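
The bit-level conditions can be checked exhaustively over all 8-bit characteristics. This sketch assumes that every E term in the S14 equation is complemented, which is what the condition E <= 115 (I >= 14 with I = 129 - E) requires; the function names are hypothetical.

```python
def bit(e, n):
    """Bit En of the characteristic E (E0 is the LSB)."""
    return (e >> n) & 1

def s14(e):
    """S14 with all E terms complemented: ~E7 (~E6 v ~E5 v ~E4 v ~E3 ~E2)."""
    n = [1 - bit(e, k) for k in range(8)]      # n[k] is the complement of Ek
    return n[7] & (n[6] | n[5] | n[4] | (n[3] & n[2]))

def eight_ones(e):
    """E3 xor (E2 v E1): true where E mod 16 is 2 through 9."""
    return bit(e, 3) ^ (bit(e, 2) | bit(e, 1))

def four_ones(e):
    """E2 xor E1: true where E mod 8 is 2, 3, 4, or 5."""
    return bit(e, 2) ^ bit(e, 1)

def two_ones(e):
    """E1: true where E mod 4 is 2 or 3."""
    return bit(e, 1)
```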

Perform Iteration I - The shift register lengths are set to 26. First D(I) is
constructed by clearing the shift registers to zeros and setting register B to
one. Where S1 = 0 (where E > 128) the shift register is shifted right E - 129
places. The first shift clears the B register, the shift registers.




MPP SCIENTIFIC SUBROUTINES 


GOODYEAR AEROSPACE 
CORPORATION 
GER-17221 


2.2 HOL Interface Requirements for Array Subroutines 


All MPP Array functions require an MCU program to load the MCU Call Queue
registers with the arguments of the function and then call a specific PECU
Subroutine. This is usually performed by an MCL MACRO, or by some
Higher-Order-Language (HOL).

This Section describes the minimum requirements for a HOL or MACRO
to use the Scientific Functions.

Note that the subroutines ARCTNA and SQRTA call the PECU directly,
and that LNA, EXPA, SINA, COSA, and SINCOS call MCU subroutines that
perform iterative PECU calls.

Tables 2.2.1 - 2.2.6 describe HOL interfaces for each Array subroutine.
Table 2.2.1 LNA - Natural Logarithm Subroutine

HOL Interface Specifications


Name :               LNA (X,Y,E,T)

Description :        Compute Y = LN(X)

Global MCU Names :   LN$V

Required Arguments : X - Input Array in 32-bit VAX format
                     Y - Destination Array in 32-bit VAX format
                     E - Error Output Status bitplane:
                            Set if source 'X' was Negative
                            Clear Otherwise
                     T - Temporary Storage Array of 56 bitplanes


Required Main Control Queue Registers :

   Scalar Queue Registers :

      R32 - 'PEMODE' Code: 68«4
      R33 - Cleared to '0'
      R34 - High Half of Ln(2) = X'B172'
      R35 - Low Half of Ln(2) = X'16B9'

   PECU Queue Registers :

      R36 - N/U
      R37 - LSB of 'T' Temporary Array
      R38 - N/U
      R39 - LSB of 'T' Temporary Array
      R40 - LSB of 'E' Error Bitplane
      R41 - LSB of 'X' Source Array
      R42 - LSB of 'T' Temporary Array
      R43 - LSB of 'Y' Destination Array


* Call via : CALL R15,LN$V

             After Loading Queue Registers

Table 2.2.2 EXPA - Exponential Subroutine

HOL Interface Specifications


Name :               EXPA (X,Y,E,T)

Description :        Compute Y = e**X

Global MCU Names :   EXP$V

Required Arguments : X - Input Array in 32-bit VAX format
                     Y - Destination Array in 32-bit VAX format
                     E - Error Output Status of 3 bitplanes:
                            E(0) Set if input < 2**-31; output set
                                 to X'40800000' (VAX '1').
                            E(1) Set if overflow; output set to
                                 X'7FFFFFFF' (max VAX value).
                            E(2) Set if underflow; output set to
                                 X'0' (VAX 0).
                     T - Temporary Storage Array of 43 bitplanes


Required Main Control Queue Registers :

   Scalar Queue Registers :

      R32 - High Half of 1/ln(2) : X'453F'
      R33 - Low Half of 1/ln(2) : X'D630'
      R34 - N/U
      R35 - N/U

   PECU Queue Registers :

      R36 - N/U
      R37 - N/U
      R38 - N/U
      R39 - LSB of 'T' Temporary Array
      R40 - LSB of 'E' Error Array
      R41 - LSB of 'X' Source Array
      R42 - N/U
      R43 - LSB of 'Y' Destination Array


* Call via : CALL R15,EXP$V

             After Loading Queue Registers

Table 2.2.3 SQRTA - Square Root Subroutine

HOL Interface Specifications


Name :               SQRTA (X,Y,E,T)

Description :        Compute Y = SQRT(X)

Global PECU Names :  SQRTV$

Required Arguments : X - Input Array in 32-bit VAX format
                     Y - Destination Array in 32-bit VAX
                     E - Error bitplane, Set if X was Negative
                     T - Temporary Storage Array of 22 bitplanes


Main Control Queue Registers :

   Scalar Queue Registers :

      R32 - N/U
      R33 - N/U
      R34 - N/U
      R35 - N/U

   PECU Queue Registers :

      R36 - N/U
      R37 - N/U
      R38 - N/U
      R39 - N/U
      R40 - Error bitplane
      R41 - MSB of 'T' Temporary Array
      R42 - LSB of 'X' Source Array
      R43 - LSB of 'Y' Destination Array


* Call via : LR R44,SQRTV$
Table 2.2.4 SIN, COS - Sine, Cosine Subroutine

HOL Interface Specifications


Name :               SINA, COSA (X,Y,T)

Description :        Compute Y = SIN(X), COS(X)

Global MCU Names :   SNCS$V

Required Arguments : X - Input Array in 32-bit VAX format
                     Y - Destination Array: Function in 32-bit VAX
                     T - Temporary Storage Array of 90 bitplanes


Main Control Queue Registers :

   Scalar Queue Registers :

      R32 - N/U (Set within SNCS$V)
      R33 - N/U
      R34 - N/U
      R35 - N/U

   PECU Queue Registers :

      R36 - N/U
      R37 - N/U
      R38 - LSB of 'Y' Destination Array
      R39 - N/U
      R40 - '1' for SINE; '2' for COSINE
      R41 - LSB of 'X' Source Array
      R42 - LSB of Source Exponent (R41-23)
      R43 - LSB of Temporary Storage Array 'T'


* Call via : CALL R15,SNCS$V
Table 2.2.5 SINCOS - SineCosine Subroutine

HOL Interface Specifications


Name :               SINCOS (X,Y,Z,T)

Description :        Compute Y = SIN(X), and Z = COS(X)

Global MCU Names :   SNCS$V

Required Arguments : X - Input Array in 32-bit VAX format
                     Y - Destination Array 'SIN' in 32-bit VAX
                     Z - Destination Array 'COS' in 32-bit VAX
                     T - Temporary Storage Array of 90 bitplanes


Main Control Queue Registers :

   Scalar Queue Registers :

      R32 - N/U (Set within SNCS$V)
      R33 - N/U
      R34 - N/U
      R35 - N/U

   PECU Queue Registers :

      R36 - N/U
      R37 - N/U
      R38 - LSB of 'Y' Destination Array
      R39 - LSB of 'Z' Destination Array
      R40 - '4' denotes Sine and Cosine
      R41 - LSB of 'X' Source Array
      R42 - LSB of Source Exponent (R41-23)
      R43 - LSB of Temporary Storage Array 'T'


* Call via : CALL R15,SNCS$V
Table 2.2.6 ATANA - Arctangent Array Subroutine

HOL Interface Specifications


Name :               ARCTNA (X,Y,T)

Description :        Compute Y = ARCTAN(X)

Global PECU Names :  ATANV$

Required Arguments : X - Input Array in 32-bit VAX format
                     Y - Destination Array in 32-bit VAX
                     T - Temporary Storage Array of 82 bitplanes


Main Control Queue Registers :

   Scalar Queue Registers :

      R32 - X'FE6A'
      R33 - X'C3DD'
      R34 - X'6CCD'
      R35 - X'994E'

   PECU Queue Registers :

      R36 - N/U
      R37 - N/U
      R38 - N/U
      R39 - LSB of Destination Array 'Y'
      R40 - MSB of Temporary Array 'T'
      R41 - LSB-41 of Temporary Array
      R42 - LSB of Temporary Array
      R43 - LSB of Input Array 'X'


* Call via : LR R44,ATANV$
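
As an illustration of what a HOL or MACRO has to prepare, the sketch below builds the queue register contents for ARCTNA from Table 2.2.6. It is not MCL code: the function name, the dictionary representation, and the example array addresses are hypothetical, and it assumes bitplane addresses are plain integers with the MSB of the 82-plane temporary array 81 planes below its LSB.

```python
ATAN_SCALARS = (0xFE6A, 0xC3DD, 0x6CCD, 0x994E)   # R32..R35 from Table 2.2.6
T_PLANES = 82                                      # temporary array size

def arctna_queue(x_lsb, y_lsb, t_lsb):
    """Return {register number: value} for a call to ATANV$."""
    regs = {32 + k: value for k, value in enumerate(ATAN_SCALARS)}
    regs.update({36: None, 37: None, 38: None})    # N/U
    regs[39] = y_lsb                               # LSB of destination array Y
    regs[40] = t_lsb - (T_PLANES - 1)              # MSB of temporary array T
    regs[41] = t_lsb - 41                          # LSB-41 of temporary array
    regs[42] = t_lsb                               # LSB of temporary array
    regs[43] = x_lsb                               # LSB of input array X
    return regs

# Hypothetical X and Y addresses; 973 is the default temporary-array LSB.
queue = arctna_queue(x_lsb=100, y_lsb=200, t_lsb=973)
```

With the default temporary array this gives R40 = 892 and R41 = 932, matching the R4 and R5 defaults listed for the VFATN$ routine.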
3.0 SEQUENTIAL MCU ALGORITHMS


The MCU sequential (or serial) algorithms are described in this section. The
routines are 'sequential' in that a single 32-bit VAX input generates a single
VAX output. Each subroutine requires that input data be loaded into specific
MCU registers. Upon completion of each subroutine, the output function and
error status will also be contained in specific MCU registers. The specific
form of the 32-bit VAX real is described in Section 3.3.

3.1 GENERAL DESCRIPTION OF THE POLYNOMIAL METHOD 


The iterative algorithms employed to implement the array function modules are 
most efficient for processors that have no hardware multiplier resources. 
Because the MCU has an embedded 16 bit hardware multiplier as well as a 
hardware adder resource, an algorithm that makes effective use of these 
resources is used in place of the iterative algorithms. The algorithm used is 
the familiar Discrete Orthonormal Legendre (DOL) polynomial fitting algorithm. 
For each function, the valid domain of the independent variable is segmented 
into connected intervals. Within each interval, a polynomial that approximates 
the function to a specified level of precision (or accuracy) is computed. 

The form of the polynomial used to match the scalar function f(u) over a
u-interval is:

1) p(u) = A0+u*(A1+u*(A2+u*(A3+u*(A4+u*(A5+u*(A6+...+u*(AN)...))))))

where

o the A[j] values (j=0,1,2,...,N) are coefficients to be found while
  trying to force p(u) to approximate the true function, f(u), over
  the interval u[k] <= u < u[k+1], k=0,1,...,K ,

o u represents the independent input variable,
o k identifies the "k"th interval of the domain of u,
o K+1 specifies the total number of intervals that comprise the u
  domain, and

o p(u) is the polynomial function.
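
The nested form of Eq. 1 and the selection of the interval can be sketched in floating point as follows. The function names, the interval edges, and the coefficient tables are hypothetical; the MCU routines work in fixed point rather than in Python floats.

```python
from bisect import bisect_right

def poly_nested(coeffs, u):
    """Evaluate A0+u*(A1+u*(...+u*(AN)...)) for coeffs = [A0, A1, ..., AN]."""
    p = 0.0
    for a in reversed(coeffs):      # start with AN, finish with the A0 addition
        p = a + u * p
    return p

def piecewise(u, edges, tables):
    """Pick the k with edges[k] <= u < edges[k+1], then evaluate its polynomial."""
    k = bisect_right(edges, u) - 1
    return poly_nested(tables[k], u)
```

Each additional interval adds one coefficient table but lets every table be shorter, which is the storage and speed trade-off the text goes on to describe.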

Generally, the degree, "N", of the polynomial (and the number of operations
required to evaluate the polynomial) needed to approximate the function f(u)
to a given level of accuracy is reduced as K (the number of domain intervals)
is increased. It follows that K should be large in order to decrease the
number of operations (and execution time) required to compute "p". Within
certain "reasonable" K intervals, the statement above is true. However, if K
is too large, too many branching operations are required to identify the
particular interval in which an input "u" lies. Also, as K increases, the
amount of storage required for the "A" coefficients of all intervals tends to
increase even though the storage requirement for any one given interval tends
to decrease.

To implement the scalar function modules, K values as small as 2 and as large
as 14 have been used. In general, the degree of approximating polynomials has
been low when K is high. In no case has the degree of the polynomial been
allowed to exceed 7. Execution times for modules utilizing small K values can
be sped up by factors of 2:1 simply by increasing K. (In such case, more
memory would have to be expended for the modules.)

The input and output variable values for each of the scalar functions are 32
bit VAX real (floating point) values. Usually the "u" value used in the
polynomial of Eq. 1 is a biased or scaled and biased version of the mantissa
of the input real variable. The implemented version of the Eq. 1 polynomial
(POLY32) uses an input "u" that is a signed magnitude fraction that has the
magnitude form (0.0.32). (Note: The number form (s.i.f) describes the number
of sign bits, s, the number of integer bits, i, and the number of fractional
bits, f.) Two MCU registers are used to store the "u" magnitude; an additional
16 bit register is used to store the sign of "u". (When the sign register
contains hex 8000, "u" is positive. If the register is loaded with a "0", "u"
is negative. No other values are valid "u" sign indicators.) The permitted "u"
magnitude can range from "0" to less than 1.

The A[j] coefficients of the polynomial function are 2's complement numbers of
the form (1.31.0)*(2**(-32)). Thus, they may take on any value that lies in
the interval (-.5) <= A[j] < (+.5) .

The "p" value generated by the POLY32 routine is a 2's complement number of
the same form as A[j], namely, (1.31.0)*(2**(-32)). It may also take on any
value that lies in the interval (-.5) <= p < (+.5) .
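
A 32-bit 2's complement integer scaled by 2**(-32) covers exactly the interval from -.5 up to (but not including) +.5. The sketch below shows one way to convert between a real value and such a bit pattern, split into high and low 16-bit halves; the function names are hypothetical.

```python
import math

def to_fixed(x):
    """Encode -0.5 <= x < 0.5 as a 32-bit 2's complement pattern scaled by 2**-32."""
    if not -0.5 <= x < 0.5:
        raise ValueError("value outside (-.5) <= x < (+.5)")
    return math.floor(x * 2 ** 32) & 0xFFFFFFFF

def from_fixed(word):
    """Decode a 32-bit pattern back to a real value."""
    if word & 0x80000000:
        word -= 1 << 32
    return word / 2 ** 32

def halves(word):
    """Split a 32-bit pattern into (hi, lo) 16-bit register values."""
    return word >> 16, word & 0xFFFF
```

For example, -0.25 encodes as X'C0000000' and +0.25 as X'40000000'.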

To evaluate functions using POLY32, Eq. 1 is evaluated from right to left.
First, AN multiplies u. Then the result is added to A[N-1]. This result again
multiplies u, etc., until the A0 addition is completed. Performing the
multiply on the 32 bit inputs specified for the POLY32 routine required the
development of a small 32 bit multiply routine, MULT32, that accepts one
operand of the form of A[j] and another operand with the form of "u", and
produces a result with the form of "p" (i.e., A[j]). To make best use of the 16
bit MCU hardware multiplier to perform a 32 bit times 32 bit multiply,
cardinal multiplies are performed.

Thus, MULT32 first converts the 2's complement input, B, to the routine (e.g.,
A[j]) to a signed magnitude form like that of "u". Then the high 16 bits of
"u" and the high 16 bits of the magnitude of B are cardinal multiplied. Also,
the low 16 bits of "u" cardinal multiply the high 16 bits of the magnitude of
B; and, in like manner, the low 16 bits of the magnitude of B cardinal
multiply the high 16 bits of "u". The three products (with appropriate
offsets) are added to form a 32 bit cardinal fraction with a precision of no
worse than + or - 2**(-31).
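
The three-product scheme can be sketched on unsigned magnitudes as follows. This is an illustration of the idea only, not the MULT32 routine: it omits the sign handling and guard bits, the function name is hypothetical, and the error bound stated in the comment applies to this sketch rather than to the MCU implementation.

```python
def mult32_magnitude(b, u):
    """Approximate (b * u) >> 32 for 32-bit magnitudes with three 16x16 products.

    The low*low product is dropped and the two cross products are truncated,
    so the result is at most 2 units below the exact truncated product.
    """
    bh, bl = b >> 16, b & 0xFFFF
    uh, ul = u >> 16, u & 0xFFFF
    return bh * uh + ((bh * ul) >> 16) + ((bl * uh) >> 16)
```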
Using the sign of "u" and B, the result is converted back to the same 2's
complement form as B and presented as MULT32 output. It should be noted that
the MULT32 output permits the immediate additions required by Eq. 1. Also,
the output of the addition is of the form of B and so can be used immediately
as input to the MULT32 routine, again, as required by Eq. 1.

The MULT32 routine is implemented so as to provide at least 6 guard bits during
each POLY32 multiply operation. Since no more than 7 adds are employed in
POLY32 and the multiplies have virtually no impact on the variance of each
sum, the variance of POLY32 is only about twice that of an "A" coefficient;
POLY32 has at least 5 guard bits.

The software modules developed to implement the scalar functions, POLY32, and
MULT32 are subroutines and so are re-entrant in character. Prior to a call to
any of the modules discussed, values and addresses of values needed to execute
the routines are pre-loaded into MCU registers allocated to the routines.
Results of executions are returned in MCU registers.

The call of any scalar function module will result in the automatic load of
both the POLY32 and MULT32 routines. Independent of the number of same or
different scalar function module calls made by a user's program, these modules
will be loaded once only. If no scalar function modules are called, these
modules will not be loaded.

The scalar routines use all MCU registers. Users must save register values
they want to preserve prior to calling any scalar module.

Section 3.3 describes the interface register requirements for each function. 

The following section, Section 3.2, describes each function algorithm for the
MCU subroutines. The descriptions are given in Program Design Language (PDL)
form. In addition, the PDL steps have been numbered for reference and include
fractional partitioning using odd or even numbers to assist in identifying
logical paths.

Appendix A describes the generation of function interval polynomial
coefficients for each of the MCU functions.


3.2 DESCRIPTION OF MCU ALGORITHMS


3.2.1 MCU SQUARE ROOT SUBROUTINE : SQRTM 


This Subroutine develops the value, "Y", the square root of the input
variable, "X". "X", the input, and "Y", the output, are 32 bit VAX floating
point numbers. Along with "Y", a 16 bit status value, S, is generated for
output; S=3 indicates a negative X. The "Y" value for such X has no meaning;
only positive "X" values are permitted as input arguments.


When the exponent of "X" is even, the routine demands the calculation of
     y1=SQRT(w)
where .5 <= w < 1 ; thus, the range of y1 is (.707...) <= y1 < 1.

The value of "y1" is established using the polynomial, "p1", given by

     p1=A10*(U**0)+A11*(U**1)+A12*(U**2)+A13*(U**3)+...+A1N*(U**N)

where U=2*(w-.75) ; thus, the range of p1 is (.707...-.75) <= p1 < (1-.75) .


The polynomial "p1" is computed from right to left using

     p1=A10+U*(A11+U*(A12+U*(A13+U*(A14+U*(A15+U*(A16+...+U*(A1N)...)))))) .

The POLY32 routine used to compute p1 assumes that -1/2 <= U < 1/2 and "U" has
the signed magnitude format [S,(0.31.0)]*2**(-32) and that p1 lies in the
range -1/4 <= p1 < 1/4 (it does) and has the 2's complement format
(1.31.0)*2**(-32) .

The starting location of the memory space that stores the "A1" coefficients
needed to compute p1 and then y1 is COEF1. The coefficient data are assumed
stored in the sequence:

     Address           Item

     COEF1+ 0          A10(hi)
     COEF1+ 2          A10(lo)
     COEF1+ 4          A11(hi)
     COEF1+ 6          A11(lo)
       .                 .
       .                 .
     COEF1+4*N1        A1N(hi)
     COEF1+4*N1+2      A1N(lo) ;

N1, the degree of the "p1" polynomial, is defined within this subroutine.
Once y1 is computed, the output floating point Y value is given by
     Y=y1*(2**((EX-128)/2))
where EX is the VAX biased exponent of X.


When the exponent of "X" is odd, the routine demands the calculation of
     y2=SQRT(w/2)
where .5 <= w < 1 ; thus, the range of y2 is .5 <= y2 < (.707...) .

The value of "y2" is established using the polynomial, "p2", given by

     p2=A20*(U**0)+A21*(U**1)+A22*(U**2)+A23*(U**3)+...+A2N*(U**N)

where U=2*(w-.75) ; thus, the range of p2 is (.5-.75) <= p2 < (.707...-.75) .

The polynomial "p2" is computed from right to left using

     p2=A20+U*(A21+U*(A22+U*(A23+U*(A24+U*(A25+U*(A26+...+U*(A2N)...)))))) .

The POLY32 routine used to compute p2 assumes that -1/2 <= U < 1/2 and "U" has
the signed magnitude format [S,(0.31.0)]*2**(-32) and that p2 lies in the
range -1/4 <= p2 < 1/4 (it does) and has the 2's complement format
(1.31.0)*2**(-32) .

The starting location of the memory space that stores the "A2" coefficients
needed to compute p2 and then y2 is COEF2. The coefficient data are assumed
stored in the sequence:

     Address           Item

     COEF2+ 0          A20(hi)
     COEF2+ 2          A20(lo)
     COEF2+ 4          A21(hi)
     COEF2+ 6          A21(lo)
     COEF2+ 8          A22(hi)
     COEF2+ 10         A22(lo)
       .                 .
       .                 .
     COEF2+4*N2        A2N(hi)
     COEF2+4*N2+2      A2N(lo) ;


N2, the degree of the "p2" polynomial, is defined within this subroutine.
Once y2 is computed, the output floating point Y value is given by
     Y=y2*(2**((EX-128+1)/2))
where EX is the VAX biased exponent of X.


For both the "p1" and "p2" polynomials above, coefficients of the polynomial
are assumed to have the same format as is used for the polynomial value. The
"A1" and "A2" coefficient blocks for both polynomials are stored as part of
this subroutine.

The degree of the polynomials is at least 1 .
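
The even/odd exponent split can be illustrated in floating point. In this sketch math.sqrt stands in for the p1 and p2 polynomials (it is not the DOL polynomial fit), the function name is hypothetical, and w and EX are taken as already unpacked from the VAX word.

```python
import math

def sqrt_by_parity(w, ex):
    """Return (y, EY) with SQRT(w * 2**(ex-128)) = y * 2**(EY-128).

    w  : true mantissa, 0.5 <= w < 1
    ex : VAX biased exponent of X (bias 128)
    """
    if ex % 2 == 0:
        y = math.sqrt(w)                 # y1; the routine forms it as p1 + .75
        ey = (ex - 128) // 2
    else:
        y = math.sqrt(w / 2)             # y2; the routine forms it as p2 + .75
        ey = (ex - 128 + 1) // 2
    return y, ey + 128
```

In both branches y stays within .5 <= y < 1, so it can be used directly as the true mantissa of the result.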


The entry branch and link register for this subroutine is RF. The subroutine
calls the POLY32 subroutine by way of register RF. The POLY32 subroutine, in
turn, calls the subroutine, MULT32.MS, as an internal subroutine (i.e., no BAL
register is used).

Registers directly required by this subroutine are marked with a * .
Registers indirectly required by the POLY32 routine are marked with a # .
Registers indirectly required by the MULT32.MS routine are marked with a $ .

[Register usage table: one column per register RE, RD, RC, RB, RA, R9, ...,
R0 and one row each for "SQRTM Use" (marked *), "POLY32 Use" (marked #), and
"MULT32.MS Use" (marked $). The column positions of the individual marks are
not recoverable from this copy.]


ON ENTRY: 
R9=Xlo 
RB=Xhi 

ON EXIT: 
R0=Ylo 
R2=Yhi 


1. SQRTM entry .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!   !   !   !Xhi!   !Xlo!   !   !   !   !   !   !   !   !   !

2. RE=0    (Set status to 0.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   !   !Xhi!   !Xlo!   !   !   !   !   !   !   !   !   !


3. IF RB=0    (Check for X=0; return with Y=0 in such case. Also, check for
              X=negative; abort with status value of 3 in such case.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   !   !  0!   !Xlo!   !   !   !   !   !   !   !   !   !

      IF R9=0
         R0=0 .    (Ylo=0.)
         R2=0 .    (Yhi=0.)
         RETURN (by way of RF).

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !   !   !   !   !   !   !   !   !(0)!   !(0)!
     Use:!  0!   !   !  0!   !  0!   !   !   !   !   !   !Yhi!   !Ylo!

      Else
         Continue

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   !   !  0!   !Xlo!   !   !   !   !   !   !   !   !   !

      End If.

   Else
      IF RB=negative
         RE=X'0003'
         RETURN (by way of RF).

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  3!   !   !Xhi!   !Xlo!   !   !   !   !   !   !   !   !   !

      Else
         Continue

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   !   !Xhi!   !Xlo!   !   !   !   !   !   !   !   !   !

      End If.

   End If.

4. (RB, RC)=RB*X'0200' , cardinal multiply.

                         (Put X exponent (and "0" sign bit),
                         right justified, into RB; put true
                         mantissa-.75 into (R9, RA) (radix
                         point for X true mantissa is 1 bit
                         position left of left edge of R9).
                         EX is biased exponent of X.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   !mh1! EX!   !Xlo!   !   !   !   !   !   !   !   !   !

   (R9, RA)=R9*X'0200' , cardinal multiply.

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   !mh1! EX!mlo!mh2!   !   !   !   !   !   !   !   !   !

   R9=R9 .OR. RC         (Merge mantissa chunks.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0!   ! - ! EX!mlo!mhi!   !   !   !   !   !   !   !   !   !



5. (Create true mantissa-.75 . Then conceptually multiply result by 2 to
   create U. Radix point will then be at left edge of R9.)

   RD=.NOT. R9 .        (Lead bit of RD, RD(0), now contains sign bit of U.)

   RD=RD .AND. X'8000' .    (Clears all but U sign bit.)

   If RD=0
                        (Sign bit of U is 0, i.e., +.)

      R9=R9 .EXCLUSIVE OR. X'8000' .   (Clear lead bit of R9 when
                                       true mantissa-.75 is + ; creates
                                       Uhi. Ulo already exists. Uhi,Ulo
                                       is the magnitude of U.)

   Else

      R9=R9 .EXCLUSIVE OR. X'7FFF' .   (When true mantissa-.75 is - ,
                                       complement remaining bits of R9
                                       to create Uhi; the lead bit of R9
                                       is already clear. Now proceed to
                                       complement RA which becomes Ulo.
                                       Uhi,Ulo is the magnitude of U.)

      RA=RA .EXCLUSIVE OR. X'FFFF' .   (Complement complete.)
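
Steps 4 and 5 can be mirrored with ordinary integer operations. This is a sketch, not MCU code: each 16-bit cardinal multiply is modelled as a Python product split into high and low 16-bit halves, the function name is hypothetical, and the input is assumed to be a positive, non-zero VAX value (step 3 has already returned otherwise).

```python
def unpack_u(xhi, xlo):
    """Return (EX, sign flag RD, Uhi, Ulo) following PDL steps 4 and 5."""
    product = xhi * 0x0200
    rb = product >> 16               # EX (and "0" sign bit), right justified
    rc = product & 0xFFFF            # high 7 fraction bits, left justified
    product = xlo * 0x0200
    r9 = product >> 16               # next 9 fraction bits
    ra = product & 0xFFFF            # last 7 fraction bits, left justified
    r9 |= rc                         # merge mantissa chunks
    rd = (~r9) & 0x8000              # 0 means U is +, X'8000' means U is -
    if rd == 0:
        r9 ^= 0x8000                 # clear lead bit; magnitude is Uhi,Ulo
    else:
        r9 ^= 0x7FFF                 # complement the remaining bits of R9
        ra ^= 0xFFFF                 # and all of RA
    return rb, rd, r9, ra
```

For X = 1.0 (X'4080' X'0000') the mantissa is .5, so U = 2*(.5-.75) is negative and its magnitude is the complemented pattern X'7FFF' X'FFFF', just under .5.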


Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0! SU! - ! EX!Ulo!Uhi!   !   !   !   !   !   !   !   !   !



6. RC=X'0001'           (Detect odd/even character of X exponent.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0! SU!  1! EX!Ulo!Uhi!   !   !   !   !   !   !   !   !   !

   RC=RC .AND. RB       (RC=1 if EX is odd; RC=0 if EX is even.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0! SU!O/E! EX!Ulo!Uhi!   !   !   !   !   !   !   !   !   !

   If RC=0

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0! SU!  0! EX!Ulo!Uhi!   !   !   !   !   !   !   !   !   !

      R7=N1             (Set degree for calculating "p1" (even EX).)
      R8=COEF1+4*N1+2   (Set to last 16 bit address of COEF1 block.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
     Use:!  0! SU!  0! EX!Ulo!Uhi!loc!cnt!   !   !   !   !   !   !   !

      BAL,RF POLY32.    (Compute "p1" polynomial.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !   !   !   !   !   !   !   !   ! p1!   ! p1!
     Use:!  0! SU!  0! EX!Ulo!Uhi! - ! - !   !   ! - ! - ! hi! - ! lo!

      (Warning: Polynomial coefficients must be chosen so that the maximum
      value of p1 is X'FFFFFFFF' for the maximum input X=X'FFFFFFFF'. Also,
      polynomial coefficients must be chosen so that the minimum
      value of p1 is X'C0000000' for the minimum input X=.5 .)

      R2=R2+X'4000'     (Unbias p1 to create p1+.25=Y true mantissa-.5 .
                        Result is positive; lead sign bit is 0. Radix
                        point is at left edge of R2.)

      RB=RB-128+0+256   (Preliminary development for biased "Y" exponent.)
      (RB, RC)=RB*X'0040' , 2's complement multiply.
                        (Develop "Y" exponent and sign bit.)
      (R2, R3)=R2*X'0100' , cardinal multiply. (Line up "Y" mantissa hi.)
      (R0, R1)=R0*X'0100' , cardinal multiply. (Line up "Y" mantissa lo.)
      R2=R2 .OR. RC     (Merge "Y" sign, exponent, and mantissa hi.)
      R0=R0 .OR. R3     (Merge "Y" mantissa lo1 with mantissa lo2.)

      RETURN (by way of RF) .
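
The final assembly of the VAX result can be mirrored with integer operations in the same way. This sketch is illustrative only: each 16-bit multiply is modelled as a Python product split into high and low halves, the function name and the `odd` parameter are hypothetical, and the polynomial value is passed in rather than computed.

```python
def pack_y(p_hi, p_lo, ex, odd):
    """Return (Yhi, Ylo) from the polynomial value and the exponent EX.

    p_hi, p_lo : polynomial value, 2's complement, scaled by 2**-32
    ex         : VAX biased exponent of X
    odd        : True when EX is odd (the "p2" branch adds 1)
    """
    r2 = (p_hi + 0x4000) & 0xFFFF          # p + .25 = Y true mantissa - .5
    rb = ex - 128 + (1 if odd else 0) + 256
    rc = (rb * 0x0040) & 0xFFFF            # "Y" exponent and sign bit
    r3 = (r2 * 0x0100) & 0xFFFF            # mantissa bits moving into Ylo
    r2 = (r2 * 0x0100) >> 16               # "Y" mantissa hi, lined up
    r0 = (p_lo * 0x0100) >> 16             # "Y" mantissa lo, lined up
    return r2 | rc, r0 | r3
```

For example, X = 2.25 has w = .5625 and EX = 130 (even); y1 = .75 gives p1 = 0, and the packed result X'40C0' X'0000' is the VAX value 1.5.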


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |     |     |     |     |     |     |     |     |     |     |     |     |     |
Use:     |  0  |  SU |  0  |  EX | Ulo | Uhi |  -  |  -  |     |     |     |     | Yhi |  -  | Ylo |

Else 

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |     |     |     |     |     |     |     |     |     |     |     |     |     |
Use:     |  0  |  SU |  0  |  EX | Ulo | Uhi |     |     |     |     |     |     |     |     |     |



R7=N2 (Set degree for calculating "p2" (odd EX).)
R8=COEF2+4*N2+2 (Set to last 16 bit address of COEF2 block.)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |     |     |     |     |     |     |     |     |     |     |     |     |     |
Use:     |  0  |  SU |  0  |  EX | Ulo | Uhi | loc | cnt |     |     |     |     |     |     |     |


BAL,RF POLY32. (Compute "p2" polynomial.)






Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |     |     |     |     |     |     |     |     |     |     |  p2 |     |  p2 |
Use:     |  0  |  SU |  0  |  EX | Ulo | Uhi |  -  |  -  |     |     |     |     |  hi |  -  |  lo |


(Warning: Polynomial coefficients must be chosen so that the minimum
value of p2 is X'C0000000' for the minimum input X=.25 .)

R2=R2+X'4000' (Unbias p2 to create p2+.25=Y true mantissa-.5 .
               Result is positive; lead sign bit is 0. Radix
               point is at left edge of R2.)

RB=RB-128+1+256 (Preliminary development for biased "Y" exponent.)
(RB, RC)=RB*X'0040', 2's complement multiply.
                (Develop "Y" exponent and sign bit.)
(R2, R3)=R2*X'0100', cardinal multiply. (Line up "Y" mantissa hi.)
(R0, R1)=R0*X'0100', cardinal multiply. (Line up "Y" mantissa lo.)
R2=R2 .OR. RC (Merge "Y" sign, exponent, and mantissa hi.)
R0=R0 .OR. R3 (Merge "Y" mantissa lo1 with mantissa lo2.)

RETURN (by way of RF).


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |     |     |     |     |     |     |     |     |     |     |     |     |     |
Use:     |  0  |  SU |  0  |  EX | Ulo | Uhi |  -  |  -  |     |     |     |     | Yhi |  -  | Ylo |


End If. 


END


3.2.2 MCU SINE SUBROUTINE DESCRIPTION: SINM


Subroutine develops the value, "Y", the sine of the input variable, "X".
"X", the input, and "Y", the output, are 32 bit VAX floating point numbers.
Along with "Y", a 16 bit status value, S, is generated for output; it is
0 when |X| is less than 2**24. When |X| is equal to or greater than 2**24,
the angular uncertainty is on the order of 2 radians and so the Y answer
becomes meaningless; the status is set to 3 in such case (no processing
is performed).

The "Y" value range is -1 <= Y <= +1


The subroutine demands the calculation of

Y=SIN(X)=((-1)**SX)*SIN(|X|) where

SX is the value of the sign bit of X, and
|X| symbolizes the absolute value of X.

|Y| must be computed using a number of different approximations. For true X
exponents (TEX) less than 0, the |X| interval is dissected into 3 different
sub-intervals, namely,


    1)           TEX < (-10)   or                |X| < 2**(-11)
    2)  (-10) <= TEX < ( -4)   or    2**(-11) <= |X| < 2**( -5)
and 3)  ( -4) <= TEX < (  0)   or    2**( -5) <= |X| < 2**( -1)
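The split above can be sketched in a few lines. The sketch below assumes the VAX convention that the mantissa lies in [0.5, 1), which is also what Python's math.frexp returns, so the frexp exponent plays the role of TEX. The function name sinm_path and the returned labels are hypothetical and do not come from the subroutine; zero is simply treated like the smallest magnitudes.

```python
import math

def sinm_path(x):
    """Return which SINM approximation path would apply to x (illustrative only)."""
    if x == 0.0:
        return "sub-interval 1"          # assumption: zero handled like tiny |x|
    _, tex = math.frexp(abs(x))          # |x| = m * 2**tex with 0.5 <= m < 1
    if tex >= 25:
        return "status 3"                # |x| >= 2**24: answer is meaningless
    if tex >= 0:
        return "quarter-rotation"        # |x| >= 0.5: convert to 1/4 rotations
    if tex < -10:
        return "sub-interval 1"          # |x| < 2**(-11)
    if tex < -4:
        return "sub-interval 2"          # 2**(-11) <= |x| < 2**(-5)
    return "sub-interval 3"              # 2**(-5) <= |x| < 2**(-1)
```

The ordering of the tests mirrors the PDL, which first separates negative from non-negative true exponents and only then compares against -10 and -4.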


For true X exponents greater than or equal to 0, |Y| is computed only after
converting X to units of 1/4 rotations and then converting this resultant
value, R, to an integer form. To find |Y| in such case, the fractional part
of R, Rf, is first dissected into 8 equal sub-intervals. For the appropriate
sub-interval "j", the associated polynomial, sj, is used to approximate
((SIN(Rf*(PI/2)))/2)-.25 . Then, SIN(ARG) is approximated by 2*(sj+.5).

For odd quadrants,

ARG=(1+Rf)*PI/2 .

For even quadrants,

ARG=(Rf)*PI/2 .

|Y|=2*(sj+.5) .

The final Y value is determined by the |X| quadrant index and the sign of X.
Specifically,

|Y|=2*(sj+.5) and the sign of Y, SY, is
SY=SX .EXCLUSIVE OR. Qhi .EXCLUSIVE OR. Qlo where
SX is the sign of X,
Qhi is the top bit of the |X| quadrant index, and
Qlo is the bottom bit of the |X| quadrant index.

The specific approximations for the 3 |X| sub-intervals corresponding to
negative X exponents are listed below. Then, the 8 approximations used to
approximate ((SIN(Rf*(PI/2)))/2)-.25 are listed.


APPROXIMATIONS 


Sub-interval 1 ) 


When the true exponent of X is less than -10,

|Y|=|X| .


Sub-interval 2)

When the true exponent of X is less than -4 but greater than or equal
to -10, |Y| is given by

|Y|=|X|-(|X|**3)/6 and so by

|Y|=|X|*(1-(2/3)*X2) where

X2=(|X|/2)**2.

POLY32, the polynomial expansion routine, is not used to compute this
approximation.
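As a quick floating-point check of the sub-interval 2 formula, the sketch below evaluates |X|*(1-(2/3)*X2) directly. It is only an illustration in ordinary double precision, not the 16-bit register arithmetic of the PDL, and the function name sin_small is hypothetical.

```python
import math

def sin_small(x):
    """|Y| = |X| * (1 - (2/3) * (|X|/2)**2), with the sign of X restored."""
    ax = abs(x)
    x2 = (ax / 2.0) ** 2                 # X2 = (|X|/2)**2
    y = ax * (1.0 - (2.0 / 3.0) * x2)    # same value as |X| - |X|**3/6
    return math.copysign(y, x)
```

For |X| below 2**(-5) the first neglected series term, |X|**5/120, is smaller than |X|*2**(-26), which is why two terms are enough for a 24 bit mantissa in this range.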


Sub-interval 3) 


When the true exponent of X is less than 0 but greater than or equal
to -4, |Y| is found using

|Y|=|X|*(1-G(|X|/2)) where
G(|X|/2)=1-(SIN(|X|))/|X|

The value of G(|X|/2) is established using the polynomial, "p1(U)", given by
p1(U)=A10*(U**0)+A11*(U**1)+A12*(U**2)+A13*(U**3)+ ... +A1N1*(U**N1) where
U=|X|/2 ,

the |X| range is 1/32 <= |X| < .5 ,

the U range is 1/64 <= U < .25 , and

p1 approximates (G(|X|/2))/2 .

The polynomial "p1" is computed from right to left using

p1(U)=A10+U*(A11+U*(A12+U*(A13+U*(A14+U*(A15+U*(A16+ ... +U*(A1N) .

The POLY32 routine used to compute p1 assumes that 1/64 <= U < 1/4 and
"U" has the signed magnitude format [S, (0.31.0)]*2**(-32) and that
p1 lies within the range -1/4 <= p1 <= 1/4 (it does) and has the 2's
complement format (1.31.0)*2**(-32) .
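The right-to-left evaluation can be sketched as follows. The coefficient list A1 below is an assumption: it holds the Taylor series of (1-SIN(|X|)/|X|)/2 rewritten in powers of U=|X|/2, and stands in for the report's COEF1 data, which is not reproduced in this section. The names horner and sin_mid are hypothetical, and the arithmetic is ordinary floating point rather than the fixed-point format POLY32 uses.

```python
import math

# Stand-in coefficients (NOT the report's COEF1 values): Taylor series of
# (1 - sin(x)/x)/2 in powers of U = x/2, i.e. U**2/3 - U**4/15 + 2*U**6/315 - U**8/2835.
A1 = [0.0, 0.0, 1.0 / 3.0, 0.0, -1.0 / 15.0, 0.0, 2.0 / 315.0, 0.0, -1.0 / 2835.0]

def horner(coeffs, u):
    """Evaluate coeffs[0] + u*(coeffs[1] + u*(...)) from right to left."""
    acc = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        acc = c + u * acc
    return acc

def sin_mid(x):
    """|Y| = |X| * (1 - 2*p1(|X|/2)), intended for 1/32 <= |X| < 0.5."""
    ax = abs(x)
    p1 = horner(A1, ax / 2.0)
    return math.copysign(ax * (1.0 - 2.0 * p1), x)
```

Over the stated U range this p1 stays well inside -1/4 <= p1 <= 1/4, in line with the range POLY32 is said to assume.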

The starting location of the memory space that stores the "A1" coefficients
needed to compute p1 is COEF1. The coefficient data are assumed stored in
the sequence:

Address           Item

COEF1+ 0          A10(hi)
COEF1+ 2          A10(lo)
COEF1+ 4          A11(hi)
COEF1+ 6          A11(lo)
COEF1+ 8          A12(hi)
COEF1+ 10         A12(lo)
COEF1+ 12         A13(hi)
COEF1+ 14         A13(lo)
   .                 .
   .                 .
COEF1+4*N1        A1N(hi)
COEF1+4*N1+2      A1N(lo) .

N1, the degree of the "p1" polynomial, and the COEF1 coefficient data are
defined within this subroutine.

Once p1 is computed, the output |Y| value is given by
|Y|=|X|*(1-2*p1) .


X => .5 approximations


When the true exponent of X is 0 or greater, |Y| is found by first converting
the angle |X| from a radian measure to a 1/4 rotations measure. The
converted angle is called "|R|" and is described in terms of units of
1/4 rotations. |R| is defined by

|R|=|X|/(PI/2)=|X|*(2/PI) (Units=1/4 rotations).

The fractional part of |R|, Rf, is dissected into 8 equal sized sub-intervals
that are indexed from 0 through 7. For each sub-interval, the SIN((PI/2)*Rf)
is approximated using the particular polynomial associated with the
sub-interval. The "j"th (j=0,1,...,7) sub-interval approximation of
SIN((PI/2)*Rf) is described in terms of the polynomial sj(U) where U is
related to Rf using

Rf=j/8+(U/8) where
0 <= U < 1 .
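The decomposition of |X| into a quadrant index, a sub-interval index j, and the polynomial argument U can be sketched in floating point as shown below. This is an illustration of the relations |R|=|X|*(2/PI) and Rf=j/8+(U/8) only; the function name reduce_angle is hypothetical and the integer-format arithmetic of the PDL is not modeled here.

```python
import math

def reduce_angle(x):
    """Split |x| (radians) into quadrant index, sub-interval j, and U."""
    r = abs(x) * (2.0 / math.pi)      # |R| in units of 1/4 rotations
    quadrant = int(r) % 4             # the two quadrant index bits
    rf = r - math.floor(r)            # fractional part, 0 <= Rf < 1
    j = int(rf * 8.0)                 # sub-interval index, 0..7
    u = rf * 8.0 - j                  # 0 <= U < 1, so Rf = j/8 + U/8
    return quadrant, j, u
```

For example, an input of 1.0 radian lies in quadrant 0 and falls in sub-interval j=5.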

The polynomial, "sj(U)", is defined by

sj(U)=Bj0*(U**0)+Bj1*(U**1)+Bj2*(U**2)+Bj3*(U**3)+ ... +BjMj*(U**Mj) where

the Rf range is j/8 <= Rf < (j+1)/8 , j=0,1,...,7 ,

the U range is 0 <= U < 1 , and

sj(U) approximates (SIN((Rf*PI/2)))/2 .

The polynomial "sj" is computed from right to left using

sj=Bj0+U*(Bj1+U*(Bj2+U*(Bj3+U*(Bj4+U*(Bj5+U*(Bj6+ ... +U*(BjMj) .

The POLY32 routine used to compute sj assumes that 0 <= U < 1 and
"U" has the signed magnitude format [S, (0.31.0)]*2**(-32) and that
sj lies in the range -.5 <= sj <= .5 (it does) and has the 2's
complement format (1.31.0)*2**(-32) .

The starting location of the memory space that stores the "Bj"
coefficients needed to compute sj is KOEFj. The coefficient data
are assumed stored in the sequence:

Address           Item

KOEFj+ 0          Bj0(hi)
KOEFj+ 2          Bj0(lo)
KOEFj+ 4          Bj1(hi)
KOEFj+ 6          Bj1(lo)
KOEFj+ 8          Bj2(hi)
KOEFj+ 10         Bj2(lo)
KOEFj+ 12         Bj3(hi)
KOEFj+ 14         Bj3(lo)
   .                 .
   .                 .
KOEFj+4*N3        BjN(hi)
KOEFj+4*N3+2      BjN(lo) .

Mj, the degree of the "sj" polynomial, and the KOEFj coefficient data are
defined within this subroutine.

Once sj is computed, the output SIN(Rf*(PI/2)) value is given by
SIN(Rf*(PI/2))=2*sj .


For the "p1" and "sj" polynomials above, coefficients of the
polynomial have the 2's complement format (1.31.0)*2**(-32) , the same
format as is used for the polynomial value.

The degree of the polynomials is at least 1 .


The entry branch and link register for this subroutine is RF. The subroutine
calls the POLY32 subroutine by way of register RF. The POLY32 subroutine, in
turn, calls the subroutine, MULT32.MS, as an internal subroutine (i.e., no BAL
register is used).

Registers directly required by this subroutine are marked with a "*".

Registers indirectly required by the POLY32 routine are marked with a "#".
Registers indirectly required by the MULT32.MS routine are marked with a "$".


Register: 


POLY32 Use: 
MULT32 Use: 


RE! 


RDj 


RC! 


RB| 


R9! R8i R7i 


R5i 


R4 1 R3i R2i R1| RO J 



#!*!»'» 


» ! 


! # 1 
I * ! 


«!•{*; 

# ! ! # ! 
$!$!$! 


ON ENTRY:
R9=Xlo
RB=Xhi

ON EXIT:
R0=Ylo
R2=Yhi

**************************************************

SINM entry


1. SINM entry.


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |     |     |     | Xhi |     | Xlo |     |     |     |     |     |     |     |     |     |

2. R0=X'4000' . (Capture complement of true exponent sign bit in R0.)

Register:! RE! RD! 

i t i 

RC! RB! RA! R9i R8! R7! 

1 i i 1 I i 

R6! 

i 

R5! R4 } R3! 

t 1 1 

R2i R1 

i 

! RO! 

i j 

i i i 

Use:! ! ! 

i I i l I l 

! Xhi! ! Xlo ! ! ! 

i 

* 

i 

1 1 1 
1 1 1 
1 1 1 

i 

i 

» 

! Use! 

3. R0=R0 .AND. RB (Complement of sign of X true exponent in R0 after step.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |     |     |     | Xhi |     | Xlo |     |     |     |     |     |     |     |     | SEX |


4. IF R0=0 (Split processing based on true exponent (-) or (+,0).) 


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |     |     |     | Xhi |     | Xlo |     |     |     |     |     |     |     |     |  -  |


(The true X exponent must lie in the range, -128 <= EXP-128 < 0,
to branch in this direction.

Develop X sign bit in R2(0) and "0" in other 15 bits.)

4.0 R2=X'8000' .
    R2=R2 .AND. RB . (Sign bit now in R2(0).)





Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |     |     |     | Xhi |     | Xlo |     |     |     |     |     |     |  SX |     |  -  |

4.2 RB=RB .EXCLUSIVE OR. R2 . (Zero out X sign bit in RB;
                               X becomes |X|.)





Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     | |X| |     | |X| |     |     |     |     |     |     |     |     |     |
Use:     |     |     |     |  hi |     |  lo |     |     |     |     |     |     |  SX |     |  -  |


(Determine if X true exponent is less than -10.)

4.4 R0=RB . (Replicate |X| high.)
    R0=R0-((128-10)*128) . (Remove exponent bias; add 10 to result.)
    R0=R0 .AND. X'FF80' . (Clear mantissa bits out of R0; R0 value is
                           (true X exponent + 10)*128 ; final result
                           is T1.)
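Step 4.4 can be mimicked on plain integers to see why the sign of T1 answers the "less than -10" question. The sketch below assumes that the |X| high word holds the 8 bit excess-128 exponent in bits 14 through 7 and 7 mantissa bits below it, as the multiplications by 128 in the PDL imply; the function names are hypothetical.

```python
def t1_from_xhi(xhi_mag):
    """Mimic step 4.4 on a 16-bit |X| high word (sign bit already cleared)."""
    r0 = (xhi_mag - (128 - 10) * 128) & 0xFFFF   # 16-bit wraparound subtract
    r0 &= 0xFF80                                  # clear the 7 mantissa bits
    return r0

def as_signed16(word):
    """Interpret a 16-bit word as a 2's complement integer."""
    return word - 0x10000 if word & 0x8000 else word
```

With a biased exponent of 125 (true exponent -3) the result is 7*128, and with a biased exponent of 116 (true exponent -12) it is -2*128, so the word looks negative exactly when the true exponent is below -10.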



Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     | |X| |     | |X| |     |     |     |     |     |     |     |     |     |
Use:     |     |     |     |  hi |     |  lo |     |     |     |     |     |     |  SX |     |  T1 |


4.6 If R0=negative

(X true exponent is less than -10, T1 < 0; Y=X.)

4.6.0 R2=R2 .OR. RB . (Re-insert sign bit into |X| high in R2.
                       Yhi=Xhi.)
      R0=R9 . (Ylo=Xlo.)
      RE=0 . (Set status.)
      RETURN (by way of RF).


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     | |X| |     | |X| |     |     |     |     |     |     |     |     |     |
Use:     |  0  |     |     |  hi |     |  lo |     |     |     |     |     |     | Yhi |     | Ylo |


Else

(X true exponent is greater than or = to -10 but less than 0.
(0 <= T1 < 10).)




Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     | |X| |     | |X| |     |     |     |     |     |     |     |     |     |
Use:     |     |     |     |  hi |     |  lo |     |     |     |     |     |     |  SX |     |  T1 |

4.6.1 (R0, R1)=R0*X'0400' (card). (Let E2A=2*(X true exponent + 10);
                                   put E2A into R0.)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     | |X| |     | |X| |     |     |     |     |     |     |     |     |     |
Use:     |     |     |     |  hi |     |  lo |     |     |     |     |     |     |  SX |  -  | E2A |


(Put X true mantissa magnitude into RC, RA.)

4.6.3 (RB, RC)=RB*X'0100' (card) . (Put X true mantissa (high) into
                                    RC; radix point is on left edge
                                    of RC.)
      (R9, RA)=R9*X'0100' (card) . (Put X true mantissa (low) into
                                    R9, RA.)
      RC=RC .OR. R9 . (Merge high mantissa bits.)
      RC=RC .OR. X'8000' . (Insert lead "1" bit into X mantissa;
                            result is X true mantissa, mX, in
                            (RC, RA).)
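The word shuffling of step 4.6.3 can be traced with ordinary integers. The sketch below is an illustration under stated assumptions: vax_f_words is a hypothetical helper that builds a VAX-style high and low word pair from a Python float by truncation (sign, excess-128 exponent, and 7 mantissa bits in the high word, 16 mantissa bits in the low word), and mul16 models a cardinal 16 by 16 multiply with a two-word result.

```python
import math

def vax_f_words(x):
    """Encode a nonzero float as (hi, lo) 16-bit words in VAX F style (truncating)."""
    m, e = math.frexp(abs(x))              # |x| = m * 2**e, 0.5 <= m < 1
    frac = int(m * (1 << 24)) & 0x7FFFFF   # 23 stored bits; hidden lead bit dropped
    sign = 0x8000 if x < 0 else 0
    hi = sign | ((e + 128) << 7) | (frac >> 16)
    return hi, frac & 0xFFFF

def mul16(a, b):
    """Cardinal (unsigned) 16x16 multiply returning (hi word, lo word)."""
    p = a * b
    return p >> 16, p & 0xFFFF

def true_mantissa(xhi_mag, xlo):
    """Step 4.6.3: build the true mantissa mX, left justified in (RC, RA)."""
    _, rc = mul16(xhi_mag, 0x0100)   # high mantissa bits move to the top of RC
    r9, ra = mul16(xlo, 0x0100)      # low mantissa word split across R9 and RA
    rc |= r9                          # merge high mantissa bits
    rc |= 0x8000                      # insert the hidden lead "1" bit
    return rc, ra
```

For X=0.3 the words come out as X'3F99' and X'9999', and the extracted mantissa (RC, RA) is X'9999', X'9900', that is, about 0.6 with the radix point at the left edge of RC.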


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     |     |     |     |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |     |     |     |     |  SX |  -  | E2A |


(Replicate true mantissa (magnitude); put (RC, RA) into (R5, R3).)

4.6.5 R5=RC .
      R3=RA .

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     |  mX |     |  mX |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |     |  hi |     |  lo |  SX |  -  | E2A |


(Make |X|/2 an integer. Its radix point is to be at the left
edge of R6. If R6, R5, and R3 are considered contiguous, and
the radix point now sits at the left edge of R6, then the X
true mantissa magnitude has been multiplied by 2**(-16). To
make the integer |X|/2, the mX value should have been
multiplied by 2**(tru exp-1) and not by 2**(-16). To correct
the (R6, R5, R3) value, it must be multiplied by
2**(tru exp + 15) or, since the R0 value divided by 2 is
(tru exp + 10), by 2**((R0value/2)+5) .

The values of R0 range from 2*0 to 2*9. For the various values
of R0, the values of 2**((R0value/2)+5) will be pulled from the
MCU memory table, SHF . The table follows:

SHF Table

Location      R0       2**(R0/2)

SHF + 0       2*0      X'0001'
SHF + 2       2*1      X'0002'
SHF + 4       2*2      X'0004'
SHF + 6       2*3      X'0008'
SHF + 8       2*4      X'0010'
SHF +10       2*5      X'0020'
SHF +12       2*6      X'0040'
SHF +14       2*7      X'0080'
SHF +16       2*8      X'0100'
SHF +18       2*9      X'0200'
SHF +20       2*10     X'0400'
SHF +22       2*11     X'0800'
SHF +24       2*12     X'1000'
SHF +26       2*13     X'2000'
SHF +28       2*14     X'4000'
SHF +30       2*15     X'8000'
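The table turns a shift into a multiply. The sketch below shows the idea on plain integers, following the pattern of step 4.6.9: both mantissa words are multiplied by a table entry, and the upper 32 bits of the 48 bit result are kept. The Python list SHF and the names mul16 and scale_mantissa are assumptions made for the illustration; the list index k plays the role of (table offset)/2.

```python
SHF = [1 << k for k in range(16)]   # table entries X'0001' .. X'8000'

def mul16(a, b):
    """Cardinal (unsigned) 16x16 multiply returning (hi word, lo word)."""
    p = a * b
    return p >> 16, p & 0xFFFF

def scale_mantissa(mx_hi, mx_lo, k):
    """Keep the upper 32 bits of (mx_hi:mx_lo) * 2**k, as in step 4.6.9."""
    scl = SHF[k]                     # scaling value pulled from the table
    r5, r6 = mul16(mx_hi, scl)       # shift the high word
    r3, _r4 = mul16(mx_lo, scl)      # shift the low word; R4 is discarded
    r3 |= r6                         # the two pieces never overlap, so OR merges them
    return r5, r3
```

The net effect on the 32 bit mantissa is a right shift by 16-k bits, which is how the integer form of |X|/2 ends up with its radix point at the left edge of R5.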


4.6.7 R0=R0+SHF5 . (SHF5=SHF+2*5 . Point to correct location in
                    shift table. PNT is the pointer value in R0.
                    E+S=2*(true X exponent+10)+SHF5=PNT.)
      R1=0(R0) . (Put the scaling value, SCL, in memory location
                  referenced by R0 into R1.)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     |  mX |     |  mX |     |     | PNT |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |     |  hi |     |  lo |  SX | SCL | E+S |


4.6.9 (R5, R6)=R5*R1 (card) . (Shift X true mantissa high according
                               to SCL; create higher part of |X|/2 .)
      (R3, R4)=R3*R1 (card) . (Shift X true mantissa low according
                               to SCL; create lower part of |X|/2 .)
      R3=R3 .OR. R6 . (Merge low parts of integer |X|/2; integer
                       |X|/2 is in (R5, R3) after step. Radix point
                       is at left edge of R5.)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     | int |     | int |     |     | PNT |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |  -  | |X| |  -  | |X| |  SX | SCL | E+S |
         |     |     |     |     |     |     |     |     |     |  /2 |     |  /2 |     |     |     |
         |     |     |     |     |     |     |     |     |     |  hi |     |  lo |     |     |     |



(Split processing where X true exponent is less than -4.) 

4.6.B R0=R0-(SHF5+2*(10-4)) . (R0/2=tru exp + 4; E24=2*(tru exp + 4).)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     | int |     | int |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |  -  | |X| |  -  | |X| |  SX |  -  | E24 |
         |     |     |     |     |     |     |     |     |     |  /2 |     |  /2 |     |     |     |
         |     |     |     |     |     |     |     |     |     |  hi |     |  lo |     |     |     |


4.6.D If R0=negative

(X true exponent is less than -4 but greater than or
equal to -10. Use |Y|=|X|*(1-(2/3)*(|X|/2)**2) for
computing |Y|.

Square integer |X|/2 1st.)

4.6.D.0 (R3, R4)=R3*R5 (card) . (Multiply lo times hi (int X).)
        (R5, R6)=R5*R5 (card) . (Multiply hi times hi (int X).)
        R6=R6+R3 ,save carry out . (Partial square lo.)
        R5=R5+carry in . (Partial square hi.)
        R3=R3+R6 ,save carry out . (Full square lo.)
        R5=R5+carry in . (Full square hi.)
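The sequence in step 4.6.D.0 squares a two-word value using only 16 by 16 multiplies: the cross term lo*hi is added twice, and the lo*lo term is dropped. The sketch below follows the same register order on plain integers; mul16 and square32 are hypothetical names, and carries are modeled by masking to 16 bits.

```python
def mul16(a, b):
    """Cardinal (unsigned) 16x16 multiply returning (hi word, lo word)."""
    p = a * b
    return p >> 16, p & 0xFFFF

def square32(hi, lo):
    """Approximate the upper 32 bits of (hi:lo)**2, ignoring lo*lo."""
    r3, _r4 = mul16(lo, hi)           # lo times hi; only the high word is kept
    r5, r6 = mul16(hi, hi)            # hi times hi
    r6 += r3                          # first copy of the cross term
    r5 += r6 >> 16                    # carry into the high word
    r6 &= 0xFFFF
    r3 += r6                          # second copy of the cross term
    r5 += r3 >> 16                    # carry into the high word
    r3 &= 0xFFFF
    return r5, r3
```

Because only truncated partial products are combined, the result can fall short of the exact upper 32 bits by at most 2 units in the last place.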



Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     | int |     | int |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |  -  | |X| |  -  | |X| |  SX |  -  | E24 |
         |     |     |     |     |     |     |     |     |     |  /2 |     |  /2 |     |     |     |
         |     |     |     |     |     |     |     |     |     | sqr |     | sqr |     |     |     |
         |     |     |     |     |     |     |     |     |     |  hi |     |  lo |     |     |     |


(Square of integer |X|/2 is in (R5, R3); radix point is at
left edge of R5. Multiply square times (2/3) = .6666... and
call the result "D". Then subtract "D", namely,
(.6666...*(|X|/2)**2) from 1 and call the result C.)

4.6.D.2 (R3, R4)=R3*X'AAAA' (card) . (Multiply lo (int |X|/2 sqr)
                                      times both (2/3) hi and
                                      (2/3) lo. The hi and lo
                                      part of (2/3)=X'AAAA' ;
                                      radix point of (2/3) hi
                                      is at left edge of
                                      X'AAAA'.)
        (R5, R6)=R5*X'AAAA' (card) . (Multiply hi (int |X|/2 sqr)
                                      times both (2/3) hi and
                                      (2/3) lo.)
        R6=R6+R5 ,save carry out . (Add for partial multiply, lo.)
        R5=R5+carry in . (Add for partial multiply hi.)
        R3=R3+R6 ,save carry out . (Add for full multiply, Dlo.)
        R5=R5+carry in . (Add for full multiply, Dhi.)
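Since both halves of 2/3 share the bit pattern X'AAAA', one multiply of the high word serves for two partial products. The sketch below traces step 4.6.D.2 on plain integers; mul16 and times_two_thirds are hypothetical names, and carries are modeled by masking to 16 bits.

```python
def mul16(a, b):
    """Cardinal (unsigned) 16x16 multiply returning (hi word, lo word)."""
    p = a * b
    return p >> 16, p & 0xFFFF

def times_two_thirds(hi, lo):
    """Multiply the 32-bit value (hi:lo) by 2/3, taken as X'AAAA' for both halves."""
    r3, _r4 = mul16(lo, 0xAAAA)       # lo * (2/3) hi; only the high word is kept
    r5, r6 = mul16(hi, 0xAAAA)        # hi * (2/3) hi
    r6 += r5                          # hi * (2/3) lo: same product, 16 bits lower
    r5 += r6 >> 16                    # carry into the high word
    r6 &= 0xFFFF
    r3 += r6                          # add in the lo-word contribution
    r5 += r3 >> 16                    # carry into the high word
    r3 &= 0xFFFF
    return r5, r3
```

The result is slightly below the exact product because the partial products are truncated and X'AAAAAAAA' is itself a truncated 2/3; the shortfall stays under 4 units in the last place.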


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     |     |     |     |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |  -  | Dhi |  -  | Dlo |  SX |  -  | E24 |

("D" is now in (R5, R3). Develop C=1-D=1-(X**2)/6 .)

4.6.D.4 R5=.NOT. R5 . (1-(X**2)/6 hi.)
        R3=.NOT. R3 . (1-(X**2)/6 lo. "C" is now in (R5, R3).)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     |     |     |     |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |  -  | Chi |  -  | Clo |  SX |  -  | E24 |


(Multiply the |X| true mantissa times "C" to get the
value, mX*(1-(X**2)/6) . Call the result mV. Put into
(R5, R3).)

4.6.D.6 (RA, RB)=RA*R5 (card) . (Multiply mXlo times Chi.)
        (R5, R6)=R5*RC (card) . (Multiply Chi times mXhi.)
        RA=RA+R6 ,save carry out . (Add for partial multiply, lo.)
        R5=R5+carry in . (Add for partial multiply hi.)
        (R3, R4)=R3*RC (card) . (Multiply Clo times mXhi.)
        R3=R3+RA ,save carry out . (Add for full multiply; mVlo.)
        R5=R5+carry in . (Add for full multiply; mVhi.)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     |     |     |     |     |     |     |  mV |     |  mV |     |     |     |
Use:     |     |     |     |     |     |     |     |     |  -  |  hi |  -  |  lo |  SX |  -  | E24 |


(If R5 lead bit is 1 (i.e., if it looks negative), the
value is in the mantissa range of the output Y. Else,
mV is a hair below .5 and needs to be multiplied by 2.
Fix and produce output.)

4.6.D.8 If R5=negative (mV=.5 test.)

(Pseudonegative low likelihood branch direction;
the mV magnitude is => .5 .)

4.6.D.8.0 R0=R0+2*(128-4-1) (((Biased exponent of Y)-1)*2;
                             to align exponent, need to
                             multiply R0 by 64.)
          (R5, R6)=R5*256 (Y true mantissa hi 8 bits
                           properly positioned in R5.)
          (R3, R4)=R3*256 (Y true mantissa lowest 16 bits
                           properly positioned in R3.)

Else

4.6.D.8.1 R0=R0+2*(128-5-1) (((Biased exponent of Y)-1)*2;
                             to align exponent, need to
                             multiply R0 by 64.)
          (R5, R6)=R5*512 (Y true mantissa hi 8 bits
                           properly positioned in R5.)
          (R3, R4)=R3*512 (Y true mantissa lowest 16 bits
                           properly positioned in R3.)

End (4.6.D.8) If.


(Fix exponent.)

4.6.D.A (R0, R1)=R0*X'0040' . (Properly positioned biased
                               Y exponent is now in R1.)

(Fix mantissa.)

        R3=R3 .OR. R6 . (Merge lo bits of Y mantissa; Ylo.)
        R5=R5 .AND. X'FF7F' (Clear lead mantissa bit.)
        R2=R2 .OR. R5 (Sign and biased mantissa merge.)
        R2=R2+R1 (Add aligned biased Y exponent to aligned
                  |Yhi| mantissa.)
        R0=R3 (Move Ylo into R0.)
        RE=0 . (Status.)
        RETURN (by way of RF).

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |     |     |     |     |     |     |     |     |     |     |     |  -  | Yhi |     | Ylo |


Else

(|X| true exponent is less than 0 but greater than or
equal to -4. Use |Y|=|X|*(1-G(|X|/2)) for computing |Y|.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |  mX |     |     |     |     | int |     | int |     |     |     |
Use:     |     |     |  hi |  -  |  lo |  -  |     |     |  -  | |X| |  -  | |X| |  SX |  -  | E24 |
         |     |     |     |     |     |     |     |     |     |  /2 |     |  /2 |     |     |     |
         |     |     |     |     |     |     |     |     |     |  hi |     |  lo |     |     |     |

4.6.D.1 R6=RA . (Save |X| true mantissa low in R6.)
        RB=R0 . (Save 2*(|X| true exponent + 4) in RB.)
        RE=R2 . (Save X sign in RE.)
        R9=R5 . (Move int |X|/2 hi (Uhi) into R9.)
        RA=R3 . (Move int |X|/2 lo (Ulo) into RA.)
        RD=0 . (Load sign bit of U, a magnitude, into RD.)
        R7=N1 (Set degree for calculating "p1" where p1
               approximates (G(|X|/2))/2=(1-SIN(|X|)/|X|)/2 .)
        R8=COEF1+4*N1+2 (Set to last 16 bit address of COEF1
                         block.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |  mX |     | int | int |     |     |  mX |     |     |     |     |     |     |
Use:     |  SX |  SU |  hi | E24 | |X| | |X| | loc | cnt |  lo |  -  |  -  |  -  |  -  |  -  |  -  |
         |     |     |     |     |  /2 |  /2 |     |     |     |     |     |     |     |     |     |
         |     |     |     |     |  lo |  hi |     |     |     |     |     |     |     |     |     |
         |     |     |     |     | Ulo | Uhi |     |     |     |     |     |     |     |     |     |


4.6.D.3 BAL,RF POLY32. (Compute "p1" polynomial (G(|X|/2)).)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |  0  |  mX |     | int | int |     |     |  mX |     |     |     |  p1 |     |  p1 |
Use:     |  SX |  SU |  hi | E24 | |X| | |X| |  -  |  -  |  lo |  -  |  -  |  -  |  hi |  -  |  lo |
         |     |     |     |     |  /2 |  /2 |     |     |     |     |     |     |     |     |     |
         |     |     |     |     |  lo |  hi |     |     |     |     |     |     |     |     |     |
         |     |     |     |     | Ulo | Uhi |     |     |     |     |     |     |     |     |     |


(Generate C=(1-G(|X|/2)) approximation. Note: G(|X|/2)
radix point is 1 bit location right of the left edge
of R2.)

4.6.D.5 R2=R2 .EXCLUSIVE OR. X'7FFF' . (Use p1 to approximate
                                        G(|X|/2). Radix point of
                                        p1 (but not G) is at left
                                        edge of R2. Operation on
                                        (R2, R0) yields the hi
                                        part of (1-G(|X|/2)),
                                        i.e., Chi, an unsigned
                                        number.)
        R0=R0 .EXCLUSIVE OR. X'FFFF' . (Low part of (1-G(|X|/2));
                                        Clo.)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |  mX |     |     |     |     |     |  mX |     |     |     |     |     |     |
Use:     |  SX |  SU |  hi | E24 |  -  |  -  |  -  |  -  |  lo |  -  |  -  |  -  | Chi |  -  | Clo |


(Multiply the X true mantissa times "C" to get the value,
mX*(1-G(|X|/2)) . Call the result mV. Put into (R2, R0).)

4.6.D.7 (R6, R7)=R6*R2 (card) . (mXlo*Chi.)
        (R0, R1)=R0*RC (card) . (Clo*mXhi.)
        (R2, R3)=R2*RC (card) . (Chi*mXhi.)
        R6=R6+R3, save carry out. (Combine lower bits of partial
                                   product, mV=mX*(1-G(|X|/2)).
                                   The radix point of mV is 1 bit
                                   right of the left edge of R2.)
        R2=R2+ carry in . (Combine upper bits of partial
                           product, mV=mX*(1-G(|X|/2)).)
        R0=R0+R6, save carry out. (Combine lower bits of complete
                                   product, mV=mX*(1-G(|X|/2)).)
        R2=R2+ carry in . (Combine upper bits of complete
                           product, mV=mX*(1-G(|X|/2)).)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     |     |     |     |     |     |     |     |     |     |  mV |     |  mV |
Use:     |  SX |  SU |  -  | E24 |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  -  |  hi |  -  |  lo |

(If R2(1) bit is 1, the value is in the mantissa range
of the output Y. Else, mV is a hair below .5 and needs
to be multiplied by 2. Note: The radix point of mV is
1 bit right of the left edge of R2.

Fix and produce output.)

4.6.D.9 If R2(1)=0 (Less than .5 test.)

(The mV magnitude is < .5 branch direction.)

4.6.D.9.0 RB=RB+2*(128-5-1) (((Biased exponent of Y)-1)*2;
                             to align exponent, need to
                             multiply R0 by 64.)
          (R2, R3)=R2*1024 (Y true mantissa hi 8 bits
                            properly positioned in R2.)
          (R0, R1)=R0*1024 (Y true mantissa lowest 16 bits
                            properly positioned in R0.)

Else

(The mV magnitude is => .5 branch direction.)

4.6.D.9.1 RB=RB+2*(128-4-1) (((Biased exponent of Y)-1)*2;
                             to align exponent, need to
                             multiply R0 by 64.)
          (R2, R3)=R2*512 (Y true mantissa hi 8 bits
                           properly positioned in R2.)
          (R0, R1)=R0*512 (Y true mantissa lowest 16 bits
                           properly positioned in R0.)

End (4.6.D.9) If.

(Fix exponent.)

4.6.D.B (RB, RC)=RB*X'0040' . (Properly positioned biased
                               Y exponent is now in RB.)

(Fix mantissa.)

        R0=R0 .OR. R3 . (Merge lo bits of Y mantissa; Ylo.)
        R2=R2 .AND. X'FF7F' (Clear lead mantissa bit.)
        R2=R2+RC (Add aligned biased Y exponent to aligned
                  |Yhi| mantissa.)
        R2=R2 .OR. RE (Sign and |Yhi| merge.)
        RE=0 . (Status.)
        RETURN (by way of RF).

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |  -  |  -  |     |     |     |     |     |     |     |     |  -  |  -  | Yhi |  -  | Ylo |

End (4.6.D) If.


Else 


(X true exponent is 0 or + starting here.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |     |     |     | Xhi |     | Xlo |     |     |     |     |     |     |     |     |  -  |

4.1 RE=X'8000' . (Prepare to capture X sign in RE(0); clear other bits.)
    RE=RE .AND. RB . (Sign of X in RE; other RE bits cleared.)
    RB=RB .EXCLUSIVE OR. RE . (X changed to a magnitude, |X|.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     | |X| |     | |X| |     |     |     |     |     |     |     |     |     |
Use:     |  SX |     |     |  hi |     |  lo |     |     |     |     |     |     |     |     |  -  |


4.3 If RB => (128+EXmax)*128 (EXmax=25. |X| greater than or equal to
                              2**24 test.)

(|X| => 2**24; too much angular uncertainty results. Return
with status value of 3.)

4.3.0 RE=3 . (Status=S=3. Worthless results exist in Yhi and Ylo
              registers.)
      RETURN via RF .

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |  3  |     |     |     |     |     |     |     |     |     |     |     |     |     |     |
Use:     |  S  |     |     |  -  |     |  -  |     |     |     |     |     |     |     |     |  -  |


Else

(|X| < 2**24 along this path.)

4.3.1 Continue.

End (4.3) If

(True exponent of |X|, TEX, lies in the range, 0 <= TEX <= 24 ,
along this path.)

(Isolate |X| biased exponent (BEX) and biased mantissa (mX).)

4.5 (RB, RC)=RB*X'0200' . (|X| biased mantissa hi left justified in RC;
                           |X| biased exponent, BEX, is in RB, right
                           justified.)
    (R9, RA)=R9*X'0200' . (|X| biased mantissa lo left justified in
                           (R9, RA).)
    R9=R9 .OR. RC . (|X| biased mantissa hi merged into R9.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     |     |  mX |  mX |     |     |     |     |     |     |     |     |     |
Use:     |  SX |     |  -  | BEX |  lo |  hi |     |     |     |     |     |     |     |     |  -  |

(Multiply BEX by 2.)

    (RB, RC)=RB*2 . (Two times BEX, 2BE, is now in RC.)

Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
         |     |     |     |     |  mX |  mX |     |     |     |     |     |     |     |     |     |
Use:     |  SX |     | 2BE |  -  |  lo |  hi |     |     |     |     |     |     |     |     |  -  |

(The input angle value, |X|, is described in terms of units of
radians. For computational convenience, the angle must be converted
to units of 1/4 rotations. This unit transformation is accomplished
by multiplying |X| by (2/PI); the resultant value, |R|, expresses
the input angle in terms of the 1/4 rotation units. The actual
transformation is accomplished by developing the true mantissa of
|X| times (2/PI); this result is called "m2p". |R| is just "m2p"
times 2**TEX where TEX is the true exponent of |X|. After the
development of |R|, it is converted to an integer form (0.2.30).

The fractional part of this integer form, Rf (a single quadrant
angle), in conjunction with the 2 integer bits of integer |R|
(the quadrant index bits), will be used to find SIN(|X|).)

(Develop the X true mantissa times 2/PI. Note that in this section of
the PDL, mX is the biased mantissa; thus, to develop m2p,
mX*(2/PI)+K is computed where K=.5*(2/PI)=1/PI . (The mX bias is
-.5 .)

The radix point of mX is 1 bit position left of the left edge of R9.
The value 2/PI is X'145F306E'*(2**(-29)) .)
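Both the hexadecimal constant and the bias correction K can be checked numerically. The sketch below is a floating-point illustration only; the names TWO_OVER_PI and m2p are chosen here for convenience and the fixed-point word arithmetic of step 4.7 is not modeled.

```python
import math

TWO_OVER_PI = 0x145F306E          # X'145F306E', to be scaled by 2**(-29)

def m2p(true_mantissa):
    """mX*(2/PI)+K with mX = true mantissa - .5 and K = .5*(2/PI) = 1/PI."""
    biased = true_mantissa - 0.5
    return biased * (2.0 / math.pi) + 1.0 / math.pi
```

Adding K simply restores the .5 that the biased mantissa is missing, so m2p equals the true mantissa times 2/PI.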

4.7 R5=R9 . (Replicate mX hi in R5.)
    (RA, RB)=RA*X'145F' (card) . (Product of mX lo times 2/PI hi.)
    (R5, R6)=R5*X'306E' (card) . (Product of mX hi times 2/PI lo.)
    R5=R5+RA, save carry . (Add for partial product lo.)
    (R9, RA)=R9*X'145F' (card) . (Product of mX hi times 2/PI hi.)
    R9=R9+carry in . (Add for partial product hi.)
    R5=R5+RA, save carry . (Add for full product lo.)
    R9=R9+carry in . (Add for full product hi. Product of mX*(2/PI)
                      is now in (R9, R5) . The radix point of the
                      product is 2 bit positions to the right of
                      the left edge of R9. Now add (2/PI)/2 to
                      account for the fact that mX is the true
                      X mantissa biased down by .5 .)
    R5=R5+X'306E', save carry . (Add 1/PI lo to lo part of product.)
    R9=R9+X'145F'+carry in . (Add 1/PI hi to hi part of product.
                              Product of X true mantissa times
                              (2/PI) now is in (R9, R5). Call this
                              result "m2P".)


Register:|  RE |  RD |  RC |  RB |  RA |  R9 |  R8 |  R7 |  R6 |  R5 |  R4 |  R3 |  R2 |  R1 |  R0 |
Use:     |  SX |     | 2BE |  -  |  -  | m2P |     |     |  -  | m2P |     |     |     |     |     |
         |     |     |     |     |     |  hi |     |     |     |  lo |     |     |     |     |     |


(The product of |X|*(2/PI) is called |R|. Now develop integer
|R| from m2P and the exponent info of 2BE.)

4.9 If RC(10)=0 (|X| true exponent less than 16 test.)

(True X exponent is less than 16.)

4.9.0 Continue.

Else

(True X exponent is at least 16 (but less than EXmax).)

4.9.1 R9=R5
      R5=0

End (4.9) If

4.B RC=RC-(2*128-SHF) . ((True X exponent)*2+SHF is now in RC.
                         RC=entry point to SHF table. SEE ATANF2
                         FOR SHF TABLE DETAILS.)
    R8=0(RC) . (Content of SHF table now in R8.)
    (R5, R6)=R5*R8 (card). (Shift lo.)
    (R8, R9)=R8*R9 (card). (Shift hi.)
    R9=R9 .OR. R5 . (Merge hi part of lo product into hi result.)
    RA=R6 . (Shifted product is in (R9, RA).)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !int!int!   !   !   !   !   !   !   !   !   !
Use:     ! SX!   ! - ! - !|R|!|R|! - !   ! - ! - !   !   !   !   ! - !
         !   !   !   !   ! lo! hi!   !   !   !   !   !   !   !   !   !

(The lead 2 bits of integer |R| (R9(0) & R9(1)) are integer
bits and define the quadrant in which the angle lies. The most
significant 3 fractional bits of |R| (R9(2), R9(3), & R9(4)) comprise
the argument range index, "j", which selects the polynomial to be used
in approximating the output Y. The remaining bits of (R9, RA),
appropriately aligned, are used as input into the polynomial routine.)

(Fix output Y sign to account for quadrant; quadrant 2, 3 indicator
is in R9(0).)

4.D RE=RE .EXCLUSIVE OR. R9 . (Sign of Y, SY, in RE(0); garbage
                               elsewhere in RE.)

    RE=RE .AND. X'8000' . (Sign of Y in RE(0); "0" elsewhere in
                           RE.)


Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !int!int!   !   !   !   !   !   !   !   !   !
Use:     ! SY!   ! - ! - !|R|!|R|! - !   ! - ! - !   !   !   !   ! - !
         !   !   !   !   ! lo! hi!   !   !   !   !   !   !   !   !   !

4.F If R9(1)=1 (Odd quadrant test.)

       (Odd quadrant 1 or 3 involved. Since the angular range
       to be approximated is 0 to less than 1 quarter rotations,
       the angle value that is the fractional part of int |R|
       must be subtracted from 1 (PI/2) for these quadrants.)

4.F.0  RA=.NOT. RA . (Subtract angle lo from 1 (units 1/4
                      rotations).)

       R9=.NOT. R9 . (Subtract angle hi from 1 (units 1/4
                      rotations). Radix point 2 bits right of
                      left edge of R9.)

    Else

       (Even quadrants 0 or 2 involved.)

4.F.1  Continue.

    End (4.F) If

(The approximation subinterval index, "j", is defined by the bits
R9(2), R9(3), & R9(4). The index "j" establishes the appropriate
polynomial, "sj", to be used in finding |Y|.)

(Isolate "j" and slide the lower bits (R9(5) thru RA(15)) to the
left 5 bit positions so that they butt up against the left edge
of R9. The result in (R9, RA) will be "U".)
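The quadrant handling of steps 4.D and 4.F can be sketched in floating point as follows. This is an illustration only: the function names are hypothetical, `math.sin` stands in for the polynomial evaluation, and the exact subtraction `1 - frac` replaces the one's complement (.NOT.) used by the PDL.

```python
import math


def reduce_to_quadrant(x):
    """Return (sign, frac) with frac in [0, 1] quarter rotations."""
    sign = -1.0 if x < 0 else 1.0
    r = abs(x) * (2.0 / math.pi)      # |X| in units of 1/4 rotations
    whole = int(r)
    quadrant = whole & 3              # the 2 integer (quadrant) bits
    frac = r - whole                  # single quadrant angle
    if quadrant & 2:                  # quadrants 2, 3 flip the sign
        sign = -sign
    if quadrant & 1:                  # odd quadrants measure from PI/2 down
        frac = 1.0 - frac
    return sign, frac


def sin_by_reduction(x):
    """Sine rebuilt from the reduced angle (math.sin as a stand-in)."""
    sign, frac = reduce_to_quadrant(x)
    return sign * math.sin(frac * math.pi / 2.0)
```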

4.11 R4=R9 . (Replicate int |R| hi in R4.)

     (R4, R5)=R4*X'0020' . (Multiply int |R| hi in R4 by 32 in order
                            to isolate quadrant/"j" data in R4; the
                            lower order bits remain, left justified,
                            in R5. They are all but the 3 top
                            fractional bits of int |R| hi.)

     (RA, RB)=RA*X'0020' . (Multiply int |R| lo in RA by 32 in order
                            to slide int |R| lo bits 5 bit positions
                            to the left; lo bits reside in RB.)

     R5=R5 .OR. RA . (Merge Uhi bits in R5.)

     RA=RB . (Ulo moved to RA.)

     R9=R5 . (Uhi moved to R9.)

     RD=0 . (Clear sign bit of U. U is positive only; radix point
             at left edge of R9.)

     (Create 4*j to be able to access polynomial info GET table.)

     R4=SHIFT(R4, +2) . (Shift value in R4 left 2 bit positions.)

     R4=R4 .AND. X'001C' . (Mask out all but 4*j bits of R4; clear all
                            but bits 11, ..., 13 .)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! SY! SU! - ! - ! U ! U ! - !   !   ! - !4*j!   !   !   ! - !
         !   !   !   !   ! lo! hi!   !   !   !   !   !   !   !   !   !
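The split of the quadrant fraction into the index "j" and the polynomial argument "U" performed in step 4.11 can be sketched as below. The 30 bit integer input and the function name are assumptions made for the illustration; the PDL itself works on the register pair (R9, RA).

```python
def split_fraction(frac30):
    """Split a 30 bit quadrant fraction into (j, U).

    j is taken from the 3 most significant fraction bits; U is the
    remaining 27 bits read as a fraction in [0, 1), so that the
    fraction equals j/8 + U/8.
    """
    j = frac30 >> 27
    u = (frac30 & ((1 << 27) - 1)) / float(1 << 27)
    return j, u
```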


(R5 is used to get access to the appropriate polynomial, sj(U), for
approximating (SIN(Rfm*(PI/2)))/2 .

Note that Rfm=(j/8)+(U/8) and that 0 <= U < 1 .)

(The use of 4*j to retrieve polynomial parameters follows.)

4.13 R5=R5+GET . (R5 points to polynomial degree number
                  in "GET" table.)


                         "GET" Table

     j    k    4*j+k    Address    Value

     0    0    4*0+0    GET+ 0     M(0)
     0    2    4*0+2    GET+ 2     KOEF(0)+4*M(0)+2
     1    0    4*1+0    GET+ 4     M(1)
     1    2    4*1+2    GET+ 6     KOEF(1)+4*M(1)+2
     2    0    4*2+0    GET+ 8     M(2)
     2    2    4*2+2    GET+10     KOEF(2)+4*M(2)+2
     3    0    4*3+0    GET+12     M(3)
     3    2    4*3+2    GET+14     KOEF(3)+4*M(3)+2
     4    0    4*4+0    GET+16     M(4)
     4    2    4*4+2    GET+18     KOEF(4)+4*M(4)+2
     5    0    4*5+0    GET+20     M(5)
     5    2    4*5+2    GET+22     KOEF(5)+4*M(5)+2
     6    0    4*6+0    GET+24     M(6)
     6    2    4*6+2    GET+26     KOEF(6)+4*M(6)+2
     7    0    4*7+0    GET+28     M(7)
     7    2    4*7+2    GET+30     KOEF(7)+4*M(7)+2


     R7=0(R5); R5=R5+2 . (Load R7 with degree of selected
                          polynomial, M(j), held at address
                          pointed to by value in R5. Then
                          bump "GET" table pointer, R5.)
     R8=0(R5) . (Load R8 with KOEF(j)+4*M(j)+2 ; the last
                 16 bit address of the KOEF(j) block, the
                 coefficients of the polynomial, r(j), held
                 at address pointed to by value in R5.)


Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
Use:     ! SY! SU! - ! - !Ulo!Uhi!loc!cnt! - ! - !   !   !   !   ! - !


4.15 BAL,RF POLY32 . (Compute "r(j)" polynomial.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !   !   !   !   !   !   !   !   ! sj!   ! sj!
Use:     ! SY! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! hi! - ! lo!

(The value of sj(U) polynomial approximates

   (SIN(ARG))/2 where

   ARG=(j/8+U/8)*(PI/2) , j=0,...,7,

   0 <= U < 1 and

   -.5 <= sj <= .5 .)



(If R2(0) bit is 1, the final |Y| true exponent must be 1. If
R2(0) is 0, exponents will be less than 1.)

4.17 If R2(0)=1 (|Y|=1 test.)

        (|Y|=1 branch direction. (SIN(Rfm*(PI/2)))/2 is equal to
        .5 ; argument of SIN function is 90 degrees.)

4.17.0  R2=X'4080' . (|Y| hi in R2.)

        R0=X'0000' . (|Y| lo in R0.)

        R2=R2 .OR. RE . (Merge sign bit into |Y|; Y results.)
        RE=0 . (Status bit.)

        RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

     Else

        (|Y| < 1 branch direction. (SIN(Rfm*(PI/2)))/2 is less
        than .5 ; argument of SIN function is less than 90
        degrees.)

4.17.1  Continue .

     End (4.17) If
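The constant X'4080' loaded in step 4.17.0 is the upper word of 1.0 in the VAX 32 bit floating-point format. A minimal decoding sketch follows, assuming the usual F_floating layout (sign bit, 8 bit exponent biased by 128, hidden leading mantissa bit, mantissa in the range .5 to 1); the function name is hypothetical and reserved operands are not handled.

```python
def decode_vax_f(hi, lo):
    """Decode a VAX F_floating value from its two 16 bit words.

    hi holds sign (bit 15), biased exponent (bits 14..7) and the top
    7 fraction bits; lo holds the remaining 16 fraction bits.
    """
    sign = -1.0 if hi & 0x8000 else 1.0
    biased_exp = (hi >> 7) & 0xFF
    if biased_exp == 0:
        return 0.0                       # true zero (reserved operand ignored)
    # Restore the hidden bit: 24 bit mantissa representing .5 <= m < 1.
    mantissa = (((hi & 0x7F) | 0x80) << 16) | lo
    return sign * (mantissa / 2.0**24) * 2.0**(biased_exp - 128)
```

The same layout explains the later steps of the form R2=R2+((128-n)*128): when the shifted mantissa kernel still carries its leading 1 in the exponent's low bit position, the addition bumps the biased exponent from 128-n to 128-n+1. For example, a kernel of X'00C0' (mantissa .75) plus (128-1)*128 gives X'4040', which decodes to .75 .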
4.19 If R2(1)=0

        (|Y| < .5 branch direction. (SIN(Rfm*(PI/2)))/2 is less
        than .25 ; argument of SIN function is less than 30
        degrees.)

4.19.0  Continue .

     Else

        (.5 <= |Y| < 1 branch direction. (SIN(Rfm*(PI/2)))/2
        is less than .5 but greater than or equal to .25;
        argument of SIN function is less than 90 degrees but
        greater than or equal to 30 degrees.)

4.19.1  (R2, R3)=R2*X'0200' . (Shift |Y| mantissa kernel hi
                               left 9 bit places.)

        (R0, R1)=R0*X'0200' . (Shift |Y| mantissa kernel lo
                               left 9 bit places.)

        R0=R0 .OR. R3 . (Merge true mantissa lo parts.)

        R2=R2+((128-1)*128) . (Establish |Y| biased exponent.)

        R2=R2 .OR. RE . (Merge sign bit into |Y|; Y results.)
        RE=0 . (Status bit=S.)

        RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

     End (4.19) If



4.21 If R2(2)=0

        (|Y| < .25 branch direction. (SIN(Rfm*(PI/2)))/2 is less
        than .125 ; argument of SIN function is less than
        14.47751219 degrees.)

4.21.0  Continue .

     Else

        (.25 <= |Y| < .5 branch direction. (SIN(Rfm*(PI/2)))/2
        is less than .25 but greater than or equal to .125;
        argument of SIN function is less than 30 degrees but
        greater than or equal to 14.47751219 degrees.)

4.21.1  (R2, R3)=R2*X'0400' . (Shift |Y| mantissa kernel hi
                               left 10 bit places.)

        (R0, R1)=R0*X'0400' . (Shift |Y| mantissa kernel lo
                               left 10 bit places.)

        R0=R0 .OR. R3 . (Merge true mantissa lo parts.)
        R2=R2+((128-2)*128) . (Establish |Y| biased exponent.)

        R2=R2 .OR. RE . (Merge sign bit into |Y|; Y results.)
        RE=0 . (Status bit=S.)

        RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

     End (4.21) If
4.23 If R2(3)=0

        (|Y| < .125 branch direction. (SIN(Rfm*(PI/2)))/2 is less
        than .0625 ; argument of SIN function is less than
        7.180755781 degrees.)

4.23.0  Continue .

     Else

        (.125 <= |Y| < .25 branch direction. (SIN(Rfm*(PI/2)))/2
        is less than .125 but greater than or equal to .0625;
        argument of SIN function is less than 14.47751219 degrees
        but greater than or equal to 7.180755781 degrees.)

4.23.1  (R2, R3)=R2*X'0800' . (Shift |Y| mantissa kernel hi
                               left 11 bit places.)

        (R0, R1)=R0*X'0800' . (Shift |Y| mantissa kernel lo
                               left 11 bit places.)

        R0=R0 .OR. R3 . (Merge true mantissa lo parts.)

        R2=R2+((128-3)*128) . (Establish |Y| biased exponent.)

        R2=R2 .OR. RE . (Merge sign bit into |Y|; Y results.)
        RE=0 . (Status bit=S.)

        RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

     End (4.23) If


(The 4 lead bits of "sj", R2(0), R2(1), R2(2), & R2(3),
are "0"; the most likely result cases have now been examined.
Now take care of the cases in which "sj" is very small;
i.e., (SIN(Rfm*(PI/2)))/2 is less than .0625 or argument
of SIN function is less than 7.180755781 degrees.)

(First, define a new constant that eliminates the lead 4
zero bits of "sj".)

4.25 (R2, R3)=R2*X'0010' . (Shift |Y| mantissa kernel hi left 4
                            bit places.)

     (R0, R1)=R0*X'0010' . (Shift |Y| mantissa kernel lo left 4
                            bit places.)

     R3=R3 .OR. R0 . (Merge 4 bit left-shifted (SIN(Rfm*(PI/2)))/2
                      hi parts; radix point for the above is 3 bit
                      intervals left of the left edge of R3. Note
                      that R3(0)=0. The lo part of this value lies
                      in R1. Call this value "lsr" and consider its
                      radix point to lie at the left edge of R3.
                      Then, "lsr"=8*(SIN(Rfm*(PI/2))) .)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !   !   !   !   !   !   !   !lsr!   !lsr!   !
Use:     ! SY! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! hi!   ! lo! - !


4.27 If R3=0

        (The new variable "lsr" is smaller than 2**(-16).
        Shift "lsr" value left by 16 bit positions.)

4.27.0  R3=R1 . (Shift "lsr" lo into "lsr" hi.)

        R1=0 . (Shift "0" into "lsr" lo.)

        RA=(128-1-3-16)*2 . (2*(|Y| biased exponent kernel) in RA.)

     Else

4.27.1  RA=(128-1-3)*2 . (2*(|Y| biased exponent kernel) in RA.)

     End (4.27) If

4.29 If R5 < X'FF00' . (Test mask for R3 lead bits, 8 bits wide.)

        (The variable in (R3, R1) is smaller than 2**(-8).
        Shift value left by 8 bit positions.)

4.29.0  (R3, R4)=R3*X'0100' . (Shift hi left 8 places.)

        R3=R4 . (Put shifted value back into R3.)

        RC=8*2 . (Shift index is 8.)

     Else

4.29.1  RC=0*2 . (Shift index is 0.)

     End (4.29) If

4.2B If R5 < X'F000' . (Test mask for R3 lead bits, 4 bits wide.)

        (The variable in (R3, R1) is smaller than 2**(-4).
        Shift value left by 4 bit positions.)

4.2B.0  (R3, R4)=R3*X'0010' . (Shift hi left 4 places.)

        R3=R4 . (Put shifted value back into R3.)

        RC=RC+4*2 . (Shift index delta is 4.)

     Else

4.2B.1  Continue .

     End (4.2B) If

4.2D If R5 < X'C000' . (Test mask for R3 lead bits, 2 bits wide.)

        (The variable in (R3, R1) is smaller than 2**(-2).
        Shift value left by 2 bit positions.)

4.2D.0  (R3, R4)=R3*X'0004' . (Shift hi left 2 places.)

        R3=R4 . (Put shifted value back into R3.)

        RC=RC+2*2 . (Shift index delta is 2.)

     Else

4.2D.1  Continue .

     End (4.2D) If

4.2F If R5 < X'8000' . (Test mask for R3 lead bits, 1 bit wide.)

        (The variable in (R3, R1) is smaller than 2**(-1).
        Shift value left by 1 bit position.)

4.2F.0  (R3, R4)=R3*X'0002' . (Shift hi left 1 place.)

        R3=R4 . (Put shifted value back into R3.)

        RC=RC+1*2 . (Shift index delta is 1.)

     Else

4.2F.1  Continue .

     End (4.2F) If
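Steps 4.27 through 4.2F locate the leading 1 bit by testing progressively narrower groups of lead bits (16, 8, 4, 2, 1) and accumulating a shift index. The sketch below shows the same binary-search normalization on a single 32 bit value; it is an analogue under that simplifying assumption, not a transcription of the register-level code, and the function name is hypothetical.

```python
def normalize32(value):
    """Left-justify a nonzero unsigned 32 bit value.

    Returns (shift, normalized) where normalized has bit 31 set and
    shift is the number of bit positions moved.
    """
    shift = 0
    for width in (16, 8, 4, 2, 1):
        # True when the top `width` bits are all zero.
        if value < (1 << (32 - width)):
            value = (value << width) & 0xFFFFFFFF
            shift += width
    return shift, value
```

For example, `normalize32(0x12345)` moves the value 15 places, giving `0x91A28000`. The accumulated shift plays the role of the shift index in RC, which step 4.31.1 subtracts from the exponent kernel in RA.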

4.31 If R5=0

        (The variable in (R3, R1) is 0.)

4.31.0  R2=0 . (|Y|=0; R2=Yhi.)

        R0=0 . (|Y|=0; R0=Ylo.)

        RE=0 . (Status=S.)

        RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

     Else

        (The variable in (R3, R1) is at least X'80000000' .)

4.31.1  RA=RA-RC . (Unaligned biased |Y| exponent-1 is in RA.)

        (RA, RB)=RA*64 . ((Biased |Y| exponent-1)*128 in RB;
                          aligned data.)

        RC=RC .OR. X'001F' . (Modulo 32 mask shift index times 2.)
        RC=RC+SHF . (Entry to shift table value.)

        R6=0(RC) . (Put shift value into R6.)

        (R1, R2)=R1*R6 . (Shifted & aligned lo part of (R3, R1).)
        (R3, R4)=R3*X'0100' . (Aligned hi part of (R3, R1) is in
                               R3; |Y| hi biased, aligned
                               mantissa.)

        R1=R1 .OR. R4 . (Merge |Y| mantissa lo pieces; Ylo
                         results.)

        R3=R3+RB . (Add aligned |Y| exponent data (short by 1)
                    to aligned mantissa data; lead mantissa bit
                    fixes exponent data.)

        R3=R3 .OR. RE . (|Y| hi merged with sign of Y yields
                         Y hi.)

        R2=R3 . (Put Yhi into R2.)

        R0=R1 . (Put Ylo into R0.)

        RE=0 . (Status=S.)

        RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

     End (4.31) If
     End (4) If
END

3.2.3 MCU SUBROUTINE DESCRIPTION : COSM


This Subroutine develops the value, "Y", the cosine of the input variable,
"X". "X", the input, and "Y", the output, are 32 bit VAX floating point
numbers. Along with "Y", a 16 bit status value, S, is generated for output;
it is 0 when |X| is less than 2**24. When |X| is equal to or greater than
2**24, the angular uncertainty is on the order of 2 radians and so the Y
answer becomes meaningless; the status is set to 3 in such case (no processing
is performed).

The "Y" value range is -1 <= Y <= +1 .


The subroutine demands the calculation of

   Y=COS(X)=((-1)**SX)*COS(|X|) where

   SX is the value of the sign bit of X, and
   |X| symbolizes the absolute value of X.

|Y| is computed using the SINF subroutine after PI/2 has been added to |X|
and SX has been negated.

For all |X| values less than 2**24, |Y| is computed only after
converting X to units of 1/4 rotations and then converting this resultant
value, R, to an integer form. After negating SX and executing R=1+R, the
new R takes on the role of R of the SINF routine. The SINF code is then used
to complete the processing.
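The idea of reusing the sine path by adding one quarter rotation can be sketched in floating point as below. The sketch relies only on the identity cos(x) = sin(|x| + PI/2); it does not model the SX handling of the PDL, `math.sin` stands in for the polynomial evaluation, and the function name and the `(status, value)` return shape are hypothetical.

```python
import math


def cos_by_quarter_turn(x):
    """Return (status, y) with y = cos(x), or (3, None) when |x| >= 2**24."""
    if abs(x) >= 2.0**24:
        return 3, None                      # too much angular uncertainty
    r = abs(x) * (2.0 / math.pi) + 1.0      # quarter rotations, plus one
    whole = int(r)
    quadrant = whole & 3
    frac = r - whole
    sign = -1.0 if quadrant & 2 else 1.0    # quadrants 2, 3 are negative
    if quadrant & 1:                        # odd quadrants: reflect
        frac = 1.0 - frac
    return 0, sign * math.sin(frac * math.pi / 2.0)
```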


The entry branch and link register for this subroutine is RF. The subroutine
calls the POLY32 subroutine by way of register RF. The POLY32 subroutine, in
turn, calls the subroutine, MULT32.MS, as an internal subroutine (i.e., no BAL
register is used).

Registers directly required by this subroutine are marked with a "*".
Registers indirectly required by the POLY32 routine are marked with a "#".
Registers indirectly required by the MULT32.MS routine are marked with a "$".


Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!

(Register usage table with rows "COSF Use", "POLY32 Use" and
"MULT32.MS Use"; the per-register marks are illegible in the source.)

ON ENTRY:
   R9=Xlo

COSM entry

1.  COSM entry .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
Use:     !   !   !   !Xhi!   !Xlo!   !   !   !   !   !   !   !   !   !


    RE=X'8000' . (Prepare to capture X sign in RE(0); clear other bits.)

    RE=RE .AND. RB . (Sign of X in RE; other RE bits cleared.)

    RB=RB .EXCLUSIVE OR. RE . (X changed to a magnitude, |X|.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !|X|!   !|X|!   !   !   !   !   !   !   !   !   !
Use:     ! SX!   !   ! hi!   ! lo!   !   !   !   !   !   !   !   !   !


3   If RB => (128+EXmax)*128 (EXmax=25. |X| greater than or equal to
                              2**24 test.)

       (|X| => 2**24; too much angular uncertainty results. Return
       with status value of 3.)

3.0    RE=3 . (Status=S=3. Worthless results exist in Yhi and Ylo
               registers.)

       RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 3 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S !   !   !   !   !   !   !   !   !   !   !   !   !   !   !

    Else

       (|X| < 2**24 along this path.)

3.1    Continue.

    End (3) If

4   If RB => (128-12)*128 (|X| greater than or equal to 2**(-13) test.)

       (Y not equal to 1.)

4.0    Continue.

    Else

       (Y = 1.)

4.1    R2=X'4080' . (Y hi.)

       R0=X'0000' . (Y lo.)

       RE=0 . (Status=S.)

       RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S !   !   !   !   !   !   !   !   !   !   !   !Yhi!   !Ylo!

    End (4) If
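Steps 3 and 4 compare the upper word of |X| directly against exponent thresholds. The sketch below reproduces those comparisons, assuming the standard VAX F_floating upper-word layout (8 bit exponent biased by 128 above 7 fraction bits); the helper names and the returned strings are hypothetical.

```python
import math

EXMAX = 25   # value quoted in step 3


def vax_hi_word(x):
    """Upper 16 bit word of a non-negative x in VAX F_floating form."""
    if x == 0.0:
        return 0
    mantissa, true_exp = math.frexp(x)        # .5 <= mantissa < 1
    frac7 = int((mantissa - 0.5) * 256.0)     # 7 bits after the hidden bit
    return ((true_exp + 128) << 7) | frac7


def cosm_early_exit(x):
    """Classify x the way steps 3 and 4 do; None means continue."""
    rb = vax_hi_word(abs(x))
    if rb >= (128 + EXMAX) * 128:
        return "status 3"                     # |X| >= 2**24
    if rb < (128 - 12) * 128:
        return "Y = 1"                        # |X| < 2**(-13)
    return None
```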

(|X| must be converted to 1/4 rotation angular units; then
the result will be converted to integer form. The fractional
part of this integer form will be used to find SIN(|X|).)

(Isolate |X| biased exponent (BEX) and biased mantissa (mX).)

5   (RB, RC)=RB*X'0200' . (|X| biased mantissa hi left justified in RC;
                           |X| biased exponent, BEX, is in RB, right
                           justified.)

    (R9, RA)=R9*X'0200' . (|X| biased mantissa lo left justified in
                           (R9, RA).)

    R9=R9 .OR. RC . (|X| biased mantissa hi merged into R9.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   ! mX! mX!   !   !   !   !   !   !   !   !   !
Use:     ! SX!   !   !BEX! lo! hi!   !   !   !   !   !   !   !   !   !

(Multiply BEX by 2.)

6   (RB, RC)=RB*2 . (Two times BEX, 2BE, is now in RC.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   ! mX! mX!   !   !   !   !   !   !   !   !   !
Use:     ! SX!   !2BE! - ! lo! hi!   !   !   !   !   !   !   !   !   !

(Multiply the X true mantissa times 2/PI. Note that mX is
the biased mantissa and that the radix point of mX is
1 bit position left of the left edge of R9. The value 2/PI
is X'145F306E'*(2**(-13)).)

7   R5=R9 . (Replicate mX hi in R5.)

    (RA, RB)=RA*X'145F' (card) . (Product of mX lo times 2/PI hi.)

    (R5, R6)=R5*X'306E' (card) . (Product of mX hi times 2/PI lo.)
    R5=R5+RA, save carry . (Add for partial product lo.)

    (R9, RA)=R9*X'145F' (card) . (Product of mX hi times 2/PI hi.)
    R9=R9+carry in . (Add for partial product hi.)

    R5=R5+RA, save carry . (Add for full product lo.)

    R9=R9+carry in . (Add for full product hi. Product of mX*(2/PI)
                      is now in (R9, R5). The radix point of the
                      product is 2 bit positions to the right of
                      the left edge of R9. Now add (2/PI)/2 to
                      account for the fact that mX is the true
                      X mantissa biased down by .5 .)

    R5=R5+X'306E', save carry . (Add 1/PI lo to lo part of product.)
    R9=R9+X'145F'+carry in . (Add 1/PI hi to hi part of product.
                      Product of X true mantissa times
                      (2/PI) now is in (R9, R5). Call this
                      result "m2P". Radix point of "m2P" is
                      between R9(1) and R9(2).)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !   !m2P!   !   !   !m2P!   !   !   !   !   !
Use:     ! SX!   !2BE! - ! - ! hi!   !   ! - ! lo!   !   !   !   !   !

(The product of |X|*(2/PI) is called |R|. Now develop integer
|R| from m2P and the exponent info of 2BE.)

8   If RC < 128*2 (Negative |X| exponent test.)

       (|X| true exponent is negative.)

       (Consider that m2P has been multiplied by 2**(-16) and exists
       in (RA, R9, R5) where the radix point is between RA(1) and
       RA(2). To make m2P an integer, the number in the three register
       set must be shifted left by an amount j = |X| true exponent +16.
       The value SHF+2*j points to the correct shift value to be
       pulled from the SHF table.)

       (Proceed to develop m2P as an integer.)

8.0    RC=RC-((128-16)*2-SHF) . (RC points to shift value.)

       R8=0(RC) . (Shift constant into R8.)

       (R9, RA)=R9*R8 (card) . (Output hi.)

       (R5, R6)=R5*R8 (card) . (Output lo.)

       RA=RA .OR. R5 . (Merge lo parts.)

       (Note that (R9, RA, R6) took on the role of the starting
       3 registers, (RA, R9, R5), in the 3 preceding steps.)

       (Add 1 (1/4 rotation) to the value in (R9, RA, R6). The
       radix point of this value lies between R9(1) and R9(2).)

8.2    R9=R9+X'4000' . (Quarter rotation has been added. Call result
                        int R.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !int!int!   !   !   !   !   !   !   !   !   !
Use:     ! SR!   ! - ! - ! lo! hi! - !   ! - ! - !   !   !   !   !   !

8.4    BAL,RF SINF3., 4.D .

       RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!

    Else

       (|X| true exponent is 0 or positive.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !   !m2P!   !   !   !m2P!   !   !   !   !   !
Use:     ! SX!   !2BE! - ! - ! hi!   !   ! - ! lo!   !   !   !   !   !

8.1    If RC(10)=0 (Exponent less than 16 test.)

          (True X exponent is less than 16.)

8.1.0     Continue.

       Else

          (True X exponent is at least 16 (but less than EXmax).)

8.1.1     R9=R5

          R5=0

       End (8.1) If

8.3    RC=RC-(2*128-SHF) . ((True X exponent)*2+SHF is now in RC.
                            RC=entry point to SHF table. SEE ATANF2
                            FOR SHF TABLE DETAILS.)

       R8=0(RC) . (Content of SHF table now in R8.)

       (R5, R6)=R5*R8 (card) . (Shift lo.)

       (R8, R9)=R8*R9 (card) . (Shift hi.)

       R9=R9 .OR. R5 . (Merge hi part of lo product into hi result.)
       RA=R6 . (Shifted product is in (R9, RA).)

8.5    R9=R9+X'4000' . (Quarter rotation has been added. Call result
                        int R.)

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         !   !   !   !   !int!int!   !   !   !   !   !   !   !   !   !
Use:     ! SX!   ! - ! - !|R|!|R|! - !   ! - ! - !   !   !   !   !   !
         !   !   !   !   ! lo! hi!   !   !   !   !   !   !   !   !   !

8.7    BAL,RF SINF3., 4.D .

       RETURN via RF .

Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!
         ! 0 !   !   !   !   !   !   !   !   !   !   !   !   !   !   !
Use:     ! S ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - ! - !Yhi! - !Ylo!
    End (8) If

3.2.4 MCU SUBROUTINE DESCRIPTION : ATANM


This Subroutine develops the value, "Y", the arctangent of the input variable,
"X". "X", the input, and "Y", the output, are 32 bit VAX floating point
numbers. Along with "Y", a 16 bit status value, S, is generated for output;
it should always be 0. The "Y" value range is (-PI/2) <= Y <= (PI/2) .


The subroutine demands the calculation of

   Y=ATAN(X)=((-1)**SX)*ATAN(|X|) where

   SX is the value of the sign bit of X, and
   |X| symbolizes the absolute value of X.

|Y| must be computed using a number of different approximations. For true X
exponents (TEX) less than 2, the |X| interval is dissected into 5 different
sub-intervals, namely,



   1)          TEX < (-11)   or               |X| < 2**(-12) ,
   2) (-11) <= TEX < ( -5)   or   2**(-12) <= |X| < 2**( -6) ,
   3) ( -5) <= TEX < (  0)   or   2**( -6) <= |X| < 2**( -1) ,
   4)          TEX = (  0)   or   2**( -1) <= |X| < 2**(  0) ,
and
   5)          TEX = (  1)   or   2**(  0) <= |X| < 2**(  1) .

For true X exponents greater than or equal to 2, |Y| is computed using

   |Y|=(PI/2)-ATAN(1/|X|) .

To find |Y| in such case, 1/|X| is first computed. Then the angle
approximation associated with one of the 5 |X| sub-intervals shown above
is used to compute ATAN(1/|X|). Finally, this result is subtracted
from PI/2 .
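The selection of an approximation from the true exponent can be sketched with `math.frexp`, which returns a mantissa in the range .5 to 1 and a matching power-of-two exponent, the same convention as the true mantissa and TEX used here. The function names, the treatment of zero, and the use of 0 as the code for the reciprocal path are assumptions of the sketch.

```python
import math


def atan_subinterval(x):
    """Return 1..5 for the sub-intervals above, 0 for TEX >= 2."""
    if x == 0.0:
        return 1                      # assumption: zero takes the |Y|=|X| path
    _, tex = math.frexp(abs(x))       # |x| = m * 2**tex with .5 <= m < 1
    if tex < -11:
        return 1
    if tex < -5:
        return 2
    if tex < 0:
        return 3
    if tex == 0:
        return 4
    if tex == 1:
        return 5
    return 0                          # reciprocal path


def atan_large(x):
    """|Y| for TEX >= 2 via (PI/2)-ATAN(1/|X|), math.atan as a stand-in."""
    return math.pi / 2.0 - math.atan(1.0 / abs(x))
```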

The specific approximations for the 5 |X| sub-intervals shown above are listed
below. The 8 approximations used to approximate the reciprocal are then listed.


APPROXIMATIONS


Sub-interval 1)

When the true exponent of X is less than -11,
   |Y|=|X| .

Sub-interval 2)

When the true exponent of X is less than -5 but greater than or equal
to -11, |Y| is given by

   |Y|=|X|-(|X|**3)/3 and so by

   |Y|=|X|*(1-(1/3)*X2) where
   X2=|X|**2 .

POLY32, the polynomial expansion routine, is not used to compute this
approximation.
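A quick numerical check of the sub-interval 2 formula is sketched below. The function name is hypothetical and `math.atan` serves as the reference. Near the upper end of the range (|X| just under 2**(-6)) the first neglected series term, (|X|**5)/5, gives a relative error of about (|X|**4)/5, which is below 2**(-24).

```python
import math


def atan_sub2(x):
    """|Y| = |X|*(1-(1/3)*X2) with X2 = |X|**2, for |X| < 2**(-6)."""
    x2 = x * x
    return x * (1.0 - x2 / 3.0)


def relative_error(x):
    """Relative error of atan_sub2 against math.atan."""
    exact = math.atan(x)
    return abs(atan_sub2(x) - exact) / exact
```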


Sub-interval 3)

When the true exponent of X is less than 0 but greater than or equal
to -5, |Y| is found using

   |Y|=|X|*(1-G(|X|)) where
   G(|X|)=1-(ATAN(|X|))/|X|

The value of G(|X|) is established using the polynomial, "p1(U)", given by

   p1(U)=A10*(U**0)+A11*(U**1)+A12*(U**2)+A13*(U**3)+.....+A1N1*(U**N1) where
   U=|X| ,

   the |X| range is 1/64 <= |X| < .5 ,

   the U range is 1/64 <= U < .5 , and

   p1 approximates G(|X|) .

The polynomial "p1" is computed from right to left using

   p1(U)=A10+U*(A11+U*(A12+U*(A13+U*(A14+U*(A15+U*(A16+.....+U*(A1N) .

The POLY32 routine used to compute p1 assumes that 1/64 <= U < 1/2 and
"U" has the signed magnitude format [S,(0.31.0)]*2**(-32) and that
p1 lies within the range -1/4 <= p1 < 1/4 (it does) and has the 2's complement
format (1.31.0)*2**(-32) .
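The right-to-left (Horner) evaluation described above can be sketched as follows. The report's A1k coefficients are not reproduced in this section, so the sketch substitutes the Taylor series of G, G(x) = (x**2)/3 - (x**4)/5 + (x**6)/7 - ..., purely as a hypothetical stand-in; all names are illustrative and floating point replaces the fixed-point formats.

```python
import math


def horner(coefs, u):
    """Evaluate coefs[0] + u*(coefs[1] + u*(... + u*coefs[-1]))."""
    acc = coefs[-1]
    for a in reversed(coefs[:-1]):
        acc = a + u * acc
    return acc


def taylor_g_coefs(degree=20):
    """Stand-in coefficients for G(x) = 1 - atan(x)/x (not the report's A1k)."""
    coefs = [0.0] * (degree + 1)
    for n in range(1, degree // 2 + 1):
        coefs[2 * n] = (-1.0) ** (n + 1) / (2 * n + 1)
    return coefs


def atan_sub3(x):
    """|Y| = |X|*(1-p1) with p1 approximating G(|X|)."""
    p1 = horner(taylor_g_coefs(), x)
    return x * (1.0 - p1)


def error(x):
    """Absolute error of atan_sub3 against math.atan."""
    return abs(atan_sub3(x) - math.atan(x))
```

A fitted polynomial of the kind the report describes would normally need a lower degree than this Taylor stand-in for the same accuracy over the interval.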

The starting location of the memory space that stores the "A1" coefficients
needed to compute p1 is COEF1. The coefficient data are assumed stored in the
sequence:

   Address          Item

   COEF1+ 0         A10(hi)
   COEF1+ 2         A10(lo)
   COEF1+ 4         A11(hi)
   COEF1+ 6         A11(lo)
   COEF1+ 8         A12(hi)
   COEF1+10         A12(lo)
   COEF1+12         A13(hi)
   COEF1+14         A13(lo)
      .                .
      .                .
   COEF1+4*N1       A1N(hi)
   COEF1+4*N1+2     A1N(lo) .

N1, the degree of the "p1" polynomial, and the COEF1 coefficient data are
defined within this subroutine.

Once p1 is computed, the output |Y| value is given by
   |Y|=|X|*(1-p1) .
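The hi/lo storage of a coefficient can be sketched as below. The sketch assumes that the format (1.31.0)*2**(-32) denotes a signed 32 bit 2's complement integer scaled by 2**(-32), so that representable values lie in the range -.5 up to (but not including) .5; that reading and the function names are assumptions, not statements from the report.

```python
def to_words(value):
    """Encode value as (hi, lo) 16 bit words of a 2's complement integer * 2**(-32)."""
    n = round(value * 2**32)
    if not -2**31 <= n < 2**31:
        raise ValueError("value outside -.5 <= value < .5")
    n &= 0xFFFFFFFF                   # wrap negatives into 32 bit form
    return n >> 16, n & 0xFFFF


def from_words(hi, lo):
    """Decode (hi, lo) words back to a float."""
    n = (hi << 16) | lo
    if n & 0x80000000:                # sign bit set: negative value
        n -= 2**32
    return n / 2**32
```

With this layout, coefficient k of a block starting at a hypothetical base address occupies byte offsets 4*k (hi) and 4*k+2 (lo), matching the address column above.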


Sub-interval 4)

When the true exponent of X is 0, |Y| is found using

   |Y|=p2(U)+.5 where
   U=2*|X|-1.5 ,

The polynomial, "p2(U)", is defined by

   p2=A20*(U**0)+A21*(U**1)+A22*(U**2)+A23*(U**3)+.....+A2N1*(U**N1) where

   the |X| range is 1/2 <= |X| < 1.0 ,

   the U range is -.5 <= U < .5 , and

   p2(U) approximates .5*(ATAN((U+1.5)/2))-.5 .

The polynomial "p2" is computed from right to left using

   p2(U)=A20+U*(A21+U*(A22+U*(A23+U*(A24+U*(A25+U*(A26+.....+U*(A2N) .

The POLY32 routine used to compute p2 assumes that -1/2 <= U < 1/2 and
"U" has the signed magnitude format [S,(0.31.0)]*2**(-32) and that
p2 lies in the range -1/4 <= p2 < 1/4 (it does) and has the 2's complement
format (1.31.0)*2**(-32) .

The starting location of the memory space that stores the "A2" coefficients
needed to compute p2 is COEF2. The coefficient data are assumed stored in
the sequence:

   Address          Item

   COEF2+ 0         A20(hi)
   COEF2+ 2         A20(lo)
   COEF2+ 4         A21(hi)
   COEF2+ 6         A21(lo)
   COEF2+ 8         A22(hi)
   COEF2+10         A22(lo)
   COEF2+12         A23(hi)
   COEF2+14         A23(lo)
      .                .
      .                .
   COEF2+4*N2       A2N(hi)
   COEF2+4*N2+2     A2N(lo) .

N2, the degree of the "p2" polynomial, and the COEF2 coefficient data are
defined within this subroutine.

Once p2 is computed, the output |Y| value is given by
   |Y|=2*(p2+.5) .


Sub-interval 5)

When the true exponent of X is 1, |Y| is found using

   |Y|=2*(p3(U)+.5) where
   U=|X|-1.5 ,

The polynomial, "p3(U)", is defined by

   p3(U)=A30*(U**0)+A31*(U**1)+A32*(U**2)+A33*(U**3)+.....+A3N1*(U**N1) where

   the |X| range is 1.0 <= |X| < 2.0 ,

   the U range is -.5 <= U < .5 , and

   p3(U) approximates .5*(ATAN(U+1.5))-.5 .

The polynomial "p3" is computed from right to left using

   p3=A30+U*(A31+U*(A32+U*(A33+U*(A34+U*(A35+U*(A36+.....+U*(A3N) .

The POLY32 routine used to compute p3 assumes that -1/2 <= U < 1/2 and
"U" has the signed magnitude format [S,(0.31.0)]*2**(-32) and that
p3 lies in the range -1/4 <= p3 < 1/4 (it does) and has the 2's complement
format (1.31.0)*2**(-32) .

The starting location of the memory space that stores the "A3" coefficients
needed to compute p3 is COEF3. The coefficient data are assumed stored in
the sequence:

   Address          Item

   COEF3+ 0         A30(hi)
   COEF3+ 2         A30(lo)
   COEF3+ 4         A31(hi)
   COEF3+ 6         A31(lo)
   COEF3+ 8         A32(hi)
   COEF3+10         A32(lo)
   COEF3+12         A33(hi)
   COEF3+14         A33(lo)
      .                .
      .                .
   COEF3+4*N3       A3N(hi)
   COEF3+4*N3+2     A3N(lo) .

N3, the degree of the "p3" polynomial, and the COEF3 coefficient data are
defined within this subroutine.

Once p3 is computed, the output |Y| value is given by
   |Y|=2*(p3+.5) .
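The change of variable for sub-interval 5 can be checked with a short sketch. `math.atan` stands in for the fitted polynomial, so the code only demonstrates that the stated target function keeps p3 inside the range -1/4 to 1/4 and that |Y|=2*(p3+.5) recovers the arctangent; the function names are hypothetical.

```python
import math


def p3_target(u):
    """The function p3(U) is said to approximate: .5*ATAN(U+1.5)-.5 ."""
    return 0.5 * math.atan(u + 1.5) - 0.5


def atan_sub5(x):
    """|Y| for 1.0 <= |X| < 2.0 using U=|X|-1.5 and |Y|=2*(p3+.5)."""
    u = x - 1.5
    return 2.0 * (p3_target(u) + 0.5)
```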


Reciprocal

When the true exponent of X is 2 or greater, |Y| is found by first computing
1/|X|. The reciprocal of |X|, |R|, is defined by

   |R|=(1/(4*mX))*(2**(-TEX+2)) where

   mX here is the true mantissa of X, and
   TEX is the true exponent of X.

The term 1/(4*mX) is approximated using 8 different polynomials. Each
polynomial corresponds to 1 of 8 equal sub-intervals into which the mX range,
.5 <= mX < 1.0 , is divided. The "j"th (j=0,1,...,7) sub-interval
approximation of (1/(4*mX)) is described by the polynomial rj(U) where
U is related to mX using

   mX=.5+j/16+(U*2) where
   0 <= U < 1/32 .

The polynomial, "rj(U)", is defined by

   rj(U)=Bj0*(U**0)+Bj1*(U**1)+Bj2*(U**2)+Bj3*(U**3)+.....+BjMj*(U**Mj) where

   the mX range is .5 <= mX < 1.0 ,

   the U range is 0 <= U < 1/32 , and

   rj(U) approximates (1/(4*mX)) .

The polynomial "rj" is computed from right to left using

   rj=Bj0+U*(Bj1+U*(Bj2+U*(Bj3+U*(Bj4+U*(Bj5+U*(Bj6+.....+U*(BjMj) .

The POLY32 routine used to compute rj assumes that 0 <= U < 1/32 and
"U" has the signed magnitude format [S,(0.31.0)]*2**(-32) and that
rj lies in the range 0 <= rj <= 1/2 (it does) and has the 2's complement
format (1.31.0)*2**(-32) .

The starting location of the memory space that stores the "Bj" coefficients
needed to compute rj is KOEFj. The coefficient data are assumed stored in
the sequence:

   Address          Item

   KOEFj+ 0         Bj0(hi)
   KOEFj+ 2         Bj0(lo)
   KOEFj+ 4         Bj1(hi)
   KOEFj+ 6         Bj1(lo)
   KOEFj+ 8         Bj2(hi)
   KOEFj+10         Bj2(lo)
   KOEFj+12         Bj3(hi)
   KOEFj+14         Bj3(lo)
      .                .
      .                .
   KOEFj+4*N3       BjN(hi)
   KOEFj+4*N3+2     BjN(lo) .

Mj, the degree of the "rj" polynomial, and the KOEFj coefficient data are
defined within this subroutine.

Once rj is computed, the output 1/|X| value is given by
   1/|X|=rj*(2**(-TEX+2)) .
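The decomposition used for the reciprocal can be sketched as below. `math.frexp` supplies the true mantissa and exponent, and the exact quotient 1/(4*mX) stands in for the polynomial rj(U); the function name and return shape are hypothetical.

```python
import math


def reciprocal_pieces(x):
    """Return (j, U, rj, 1/|x|) following mX=.5+j/16+(U*2)."""
    mx, tex = math.frexp(abs(x))            # |x| = mx * 2**tex, .5 <= mx < 1
    j = int((mx - 0.5) * 16.0)              # which of the 8 sub-intervals
    u = (mx - 0.5 - j / 16.0) / 2.0         # 0 <= U < 1/32
    rj = 1.0 / (4.0 * mx)                   # stand-in for the rj(U) polynomial
    return j, u, rj, rj * 2.0 ** (2 - tex)
```

For x=4.0 the pieces are j=0, U=0, rj=.5 and TEX=3, so the result is .5*(2**(-1))=.25 .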


For the "p1", "p2", "p3" and "rj" polynomials above, coefficients of the
polynomial have the 2's complement format (1.31.0)*2**(-32) , the same
format as is used for the polynomial value.

The degree of the polynomials is at least 1.

The entry branch and link register for this subroutine is RF. The subroutine
calls the POLY32 subroutine by way of register RF. The POLY32 subroutine, in
turn, calls the subroutine, MULT32.MS, as an internal subroutine (i.e., no BAL
register is used).

Registers directly required by this subroutine are marked with a "*".

Registers indirectly required by the POLY32 routine are marked with a "#".
Registers indirectly required by the MULT32.MS routine are marked with a "$".


Register:! RE! RD! RC! RB! RA! R9! R8! R7! R6! R5! R4! R3! R2! R1! R0!

(Register usage table with rows "ATANM Use", "POLY32 Use" and
"MULT32 Use"; the per-register marks are illegible in the source.)


ON ENTRY: 
R9=Xlo 
RB=Xhi 

ON EXIT: 
R0=Ylo 
R2=Yhi 
ATANF entry


1.   ATANF entry .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |Xhi|   |Xlo|   |   |   |   |   |   |   |   |   |

2.   R0=X'4000' . (Capture complement of true exponent sign bit in R0.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |Xhi|   |Xlo|   |   |   |   |   |   |   |   |   |

3.   R0=R0 .AND. RB . (Complement of sign of X true exponent in R0
                       after step.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |Xhi|   |Xlo|   |   |   |   |   |   |   |   |SEX|

4.   IF R0=0 (Split processing based on true exponent (-) or (+, 0).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |Xhi|   |Xlo|   |   |   |   |   |   |   |   |   |


     (The true X exponent must lie in the range, -128 <= EXP-128 <= -1
     to branch in this direction.
     Develop sign bit in R2(0) and "0" in other 15 bits.)
4.0  R2=X'8000' .
     R2=R2 .AND. RB . (Sign bit now in R2(0).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |Xhi|   |Xlo|   |   |   |   |   |   | SX|   |   |

     RB=RB .EXCLUSIVE OR. R2 . (Zeros out X sign bit in RB;
                                X becomes |X|.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   ||X||   ||X||   |   |   |   |   |   |   |   |   |
    Use: |   |   |   | hi|   | lo|   |   |   |   |   |   | SX|   |   |


     (Determine if X true exponent is less than -11.)
4.4  R0=RB . (Replicate |X| high.)
     R0=R0-((128-11)*128) . (Remove exponent bias; add 11 to result.)
     R0=R0 .AND. X'FF80' . (Clear mantissa bits out of R0; R0 value is
                            (true X exponent + 11)*128; final result
                            is T1.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   ||X||   ||X||   |   |   |   |   |   |   |   |   |
    Use: |   |   |   | hi|   | lo|   |   |   |   |   |   | SX|   | T1|


4.6  If R0=negative

       (X true exponent is less than -11, T1 < 0; Y=X.)

4.6.0  R2=R2 .OR. RB . (Re-insert sign bit into X high in R2. Yhi=Xhi.)
       R0=R9 . (Ylo=Xlo.)
       RE=0 . (Set status.)
       RETURN (by way of RF).

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   ||X||   ||X||   |   |   |   |   |   |   |   |   |
    Use: |  0|   |   | hi|   | lo|   |   |   |   |   |   |Yhi|   |Ylo|
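
The Y=X shortcut works because, for a true exponent below -11, |X| is less than 2**(-12) and the relative error of ATAN(X)=X is roughly (X**2)/3, which is below the 24-bit mantissa resolution. A quick Python check of that bound, written as an illustration and not taken from the report:

```python
import math

def rel_error_identity(x):
    """Relative error of approximating atan(x) by x itself."""
    return abs(x - math.atan(x)) / abs(math.atan(x))

x = 2.0 ** -12            # upper bound of |X| on this branch
small_enough = rel_error_identity(x) < 2.0 ** -24
```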


     Else

       (X true exponent is greater than or = to -11 but less than 0.
       (0 <= T1 < 11).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   ||X||   ||X||   |   |   |   |   |   |   |   |   |
    Use: |   |   |   | hi|   | lo|   |   |   |   |   |   | SX|   | T1|

       (R0, R1)=R0*X'0400' (card) . (Let E2B=2*(X true exponent + 11);
                                      put E2B into R0.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   ||X||   ||X||   |   |   |   |   |   |   |   |   |
    Use: |   |   |   | hi|   | lo|   |   |   |   |   |   | SX|   |E2B|
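
The multiply above acts as a shift. The sketch below assumes that "(card)" denotes an unsigned 16 by 16 bit multiply whose 32-bit product is split into a high word (first register) and a low word (second register); that reading is inferred from how the results are used, and `card_mul` is an illustrative name.

```python
def card_mul(a, b):
    """Unsigned 16x16 multiply; returns (high word, low word) of the product."""
    p = (a & 0xFFFF) * (b & 0xFFFF)
    return p >> 16, p & 0xFFFF

true_exp = -3
t1 = (true_exp + 11) * 128          # value left in R0 by step 4.4
e2b, _ = card_mul(t1, 0x0400)       # high word = t1 * 2**10 / 2**16
```

Multiplying by X'0400' and keeping the high word is a right shift by 6, so (true exponent + 11)*128 becomes 2*(true exponent + 11).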


       (Put X true mantissa magnitude into RC, RA.)

4.6.3  (RB, RC)=RB*X'0100' (card) . (Put X true mantissa (high) into
                                     RC; radix point is on left edge
                                     of RC.)
       (R9, RA)=R9*X'0100' (card) . (Put X true mantissa (low) into
                                     R9, RA.)
       RC=RC .OR. R9 . (Concatenate high mantissa bits.)
       RC=RC .OR. X'8000' . (Insert lead "1" bit into X mantissa.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |   |   |   |   |   |   |
    Use: |   |   | hi| - | lo| - |   |   |   |   |   |   | SX| - |E2B|
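
Step 4.6.3 can be traced in Python as below. The sketch assumes the VAX 32-bit floating-point layout implied by the masks used in this routine (sign in the lead bit of the hi word, then an 8-bit excess-128 exponent, then 23 fraction bits with a hidden lead 1); `vax_f_words` and `card_mul` are illustrative helpers, not routines from the report.

```python
import math

def card_mul(a, b):
    """Unsigned 16x16 multiply; returns (high word, low word)."""
    p = (a & 0xFFFF) * (b & 0xFFFF)
    return p >> 16, p & 0xFFFF

def vax_f_words(x):
    """Pack x > 0 into (hi, lo) words; also return the 24-bit true mantissa."""
    m, e = math.frexp(x)                 # x = m * 2**e, .5 <= m < 1
    frac24 = int(m * (1 << 24))          # 24 bits including the lead 1
    hi = ((e + 128) << 7) | ((frac24 >> 16) & 0x7F)
    return hi, frac24 & 0xFFFF, frac24

rb, r9, frac24 = vax_f_words(0.3)
rb, rc = card_mul(rb, 0x0100)            # mantissa hi bits move into RC
r9, ra = card_mul(r9, 0x0100)            # mantissa lo splits across R9, RA
rc = (rc | r9) | 0x8000                  # merge and insert the lead 1 bit
```

After these steps (RC, RA) holds the 24-bit true mantissa left justified in 32 bits.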


       (Replicate true mantissa (magnitude); put (RC,RA) into (R5,R3).)
4.6.5  R5=RC .
       R3=RA .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   | mX|   | mX|   |   |   |
    Use: |   |   | hi| - | lo| - |   |   |   | hi|   | lo| SX| - |E2B|


       (Make integer X magnitude. Its radix point is to be at the left
       edge of R6. If R6, R5, and R3 are considered contiguous,
       then the true X mantissa magnitude has been multiplied by
       2**(-16). Instead, to make |X| an integer, it should have been
       multiplied by 2**(tru exp). Thus, the (R6, R5, R3) value should
       be multiplied by 2**(tru exp + 16) or, since the R0 value
       divided by 2 is (tru exp + 11), by 2**((R0value/2)+5). The
       values of R0 range from 2*0 to 2*10. For the various values of
       R0, the values of 2**((R0value/2)+5) will be pulled from the
       MCU memory table, SHF. The table follows:

                        SHF Table

        Location        R0        2**(R0/2)

        SHF + 0         2*0       X'0001'
        SHF + 2         2*1       X'0002'
        SHF + 4         2*2       X'0004'
        SHF + 6         2*3       X'0008'
        SHF + 8         2*4       X'0010'
        SHF +10         2*5       X'0020'
        SHF +12         2*6       X'0040'
        SHF +14         2*7       X'0080'
        SHF +16         2*8       X'0100'
        SHF +18         2*9       X'0200'
        SHF +20         2*10      X'0400'
        SHF +22         2*11      X'0800'
        SHF +24         2*12      X'1000'
        SHF +26         2*13      X'2000'
        SHF +28         2*14      X'4000'
        SHF +30         2*15      X'8000'
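
The table holds the sixteen powers of two, one 16-bit word per entry, so multiplying by an entry is a left shift. A minimal Python sketch of the lookup, with `shf_lookup` as an illustrative name:

```python
SHF = [1 << k for k in range(16)]        # word k lives at byte offset 2*k

def shf_lookup(r0):
    """r0 is twice the shift count, as in the table's R0 column."""
    return SHF[r0 // 2]

# Multiplying by a table entry shifts the 16-bit operand left.
hi, lo = divmod(0x1234 * shf_lookup(2 * 4), 1 << 16)
```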


4.6.7  R0=R0+SHF5 . (SHF5=SHF+2*5 . Point to correct location in
                     shift table. PNT is the pointer value.
                     E+S=2*(true |X| exponent+11)+SHF5.)
       R1=0(R0) . (Put value in memory location referenced by R0 into
                   R1.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   | mX|   | mX|   |   |   |
    Use: |   |   | hi| - | lo| - |   |   |   | hi|   | lo| SX|PNT|E+S|

4.6.9  (R5, R6)=R5*R1 (card) . (Shift X true mantissa high.)
       (R3, R4)=R3*R1 (card) . (Shift X true mantissa low.)
       R3=R3 .OR. R6 . (Concatenate low parts of integer |X|; integer
                        |X| is in (R5, R3) after step.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |int|   |int|   |   |   |
    Use: |   |   | hi| - | lo| - |   |   | - ||X|| - ||X|| SX|PNT|E+S|
         |   |   |   |   |   |   |   |   |   | hi|   | lo|   |   |   |

       (Split processing where X true exponent is less than -5.)

4.6.B  R0=R0-(SHF5+2*(11-5)) . (R0/2=tru exp + 5; E25=2*(tru exp + 5).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |int|   |int|   |   |   |
    Use: |   |   | hi| - | lo| - |   |   | - ||X|| - ||X|| SX| - |E25|
         |   |   |   |   |   |   |   |   |   | hi|   | lo|   |   |   |

4.6.D  If R0=negative

         (X true exponent is less than -5 but greater than or
         equal to -11. Use Y=X*(1-(1/3)*X**2) for computing Y.
         Square integer |X| 1st.)

4.6.D.0  (R3, R4)=R3*R5 (card) . (Multiply lo times hi (int X).)
         (R5, R6)=R5*R5 (card) . (Multiply hi times hi (int X).)
         R6=R6+R3 ,save carry out . (Partial square lo.)
         R5=R5+carry in . (Partial square hi.)
         R3=R3+R6 ,save carry out . (Full square lo.)
         R5=R5+carry in . (Full square hi.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |int|   |int|   |   |   |
    Use: |   |   | hi| - | lo| - |   |   | - ||X|| - ||X|| SX| - |E25|
         |   |   |   |   |   |   |   |   |   |sqr|   |sqr|   |   |   |
         |   |   |   |   |   |   |   |   |   | hi|   | lo|   |   |   |
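
The squaring step keeps hi*hi and twice the high word of lo*hi, and drops lo*lo. The Python trace below follows the register steps literally under the assumption that "(card)" is an unsigned 16 by 16 multiply returning (high word, low word); helper names and the sample value are illustrative.

```python
def card_mul(a, b):
    """Unsigned 16x16 multiply; returns (high word, low word)."""
    p = (a & 0xFFFF) * (b & 0xFFFF)
    return p >> 16, p & 0xFFFF

def add_carry(a, b):
    """16-bit add; returns (sum word, carry out)."""
    s = a + b
    return s & 0xFFFF, s >> 16

def square_4_6_d_0(r5, r3):
    """Follow step 4.6.D.0 on the 32-bit fraction (r5, r3)."""
    r3, _ = card_mul(r3, r5)          # lo * hi, keep high word
    r5, r6 = card_mul(r5, r5)         # hi * hi
    r6, c = add_carry(r6, r3)
    r5 += c
    r3, c = add_carry(r3, r6)
    r5 += c
    return (r5 << 16) | r3

x = 0x03F4A7C1                        # sample integer |X|, below 2**-5 as a fraction
exact = (x * x) >> 32
approx = square_4_6_d_0(x >> 16, x & 0xFFFF)
```

The truncated result falls short of the exact 32-bit square by at most two units in the last place.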


         (Square of integer |X| is in (R5, R3); radix point is at
         left edge of R5. Multiply square times (1/3)=.3333... and
         call the result "D". Then subtract "D" (.3333...*X**2)
         from 1 and call the result C.)

4.6.D.2  (R3, R4)=R3*X'5555' (card) . (Multiply lo (int X sqr)
                                       times both (1/3) hi and
                                       (1/3) lo. The hi and lo
                                       part of (1/3)=X'5555';
                                       radix point of (1/3) hi
                                       is at left edge of
                                       X'5555'.)
         (R5, R6)=R5*X'5555' (card) . (Multiply hi (int X sqr)
                                       times both (1/3) hi and
                                       (1/3) lo.)
         R6=R6+R5 ,save carry out . (Partial multiply, lo.)
         R5=R5+carry in . (Partial multiply hi.)
         R3=R3+R6 ,save carry out . (Full multiply, Dlo.)
         R5=R5+carry in . (Full multiply, Dhi.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |   |   |   |   |   |   |
    Use: |   |   | hi| - | lo| - |   |   | - |Dhi| - |Dlo| SX| - |E25|

         ("D" is now in (R5, R3). Develop C=1-D=1-(X**2)/3 .)

4.6.D.4  R5=.NOT. R5 . (1-(X**2)/3 hi.)
         R3=.NOT. R3 . (1-(X**2)/3 lo. "C" is now in (R5, R3).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |   |   |   |   |   |   |
    Use: |   |   | hi| - | lo| - |   |   | - |Chi| - |Clo| SX| - |E25|
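
Complementing both words of a 32-bit fraction D gives 1-D less one unit in the last place, which is why no subtract is needed here. A short Python illustration, with an arbitrary sample value for D:

```python
def one_minus(d_hi, d_lo):
    """Complement both words of the fraction D, as step 4.6.D.4 does."""
    return (~d_hi) & 0xFFFF, (~d_lo) & 0xFFFF

d = 0x00012345                                  # D as a 32-bit fraction
c_hi, c_lo = one_minus(d >> 16, d & 0xFFFF)
c = (c_hi << 16) | c_lo                         # equals 2**32 - 1 - D
```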


         (Multiply the X true mantissa times "C" to get the value,
         mX*(1-(X**2)/3). Call the result mV. Put into (R5, R3).)
4.6.D.6  (RA, RB)=RA*R5 (card) . (Multiply mXlo times Chi.)
         (R5, R6)=R5*RC (card) . (Multiply Chi times mXhi.)
         RA=RA+R6 ,save carry out . (Partial multiply, lo.)
         R5=R5+carry in . (Partial multiply hi.)
         (R3, R4)=R3*RC (card) . (Multiply Clo times mXhi.)
         R3=R3+RA ,save carry out . (Full multiply; mVlo.)
         R5=R5+carry in . (Full multiply; mVhi.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   |   |   | mV|   | mV|   |   |   |
    Use: |   |   | - | - | - | - |   |   | - | hi| - | lo| SX| - |E25|


         (If R5 lead bit is 1 (i.e., if it looks negative), the
         value is in the mantissa range of the output Y. Else,
         mV is a hair below .5 and needs to be multiplied by 2.
         Fix and produce output.)

4.6.D.8  If R5=negative

           (Pseudonegative branch direction; really, the mV
           magnitude is = to or > .5 .)

4.6.D.8.0  R0=R0+2*(128-5) (Biased exponent of Y times 2;
                            need to multiply it by 64.)
           (R5, R6)=R5*256 (Y true mantissa hi 8 bits
                            properly positioned in R5.)
           (R3, R4)=R3*256 (Y true mantissa lowest 16 bits
                            properly positioned in R3.)

         Else

4.6.D.8.1  R0=R0+2*(128-6) (Biased exponent of Y times 2;
                            need to multiply it by 64.)
           (R5, R6)=R5*512 (Y true mantissa hi 8 bits
                            properly positioned in R5.)
           (R3, R4)=R3*512 (Y true mantissa lowest 16 bits
                            properly positioned in R3.)

         End (4.6.D.8) If.
         (Fix exponent.)

4.6.D.A  (R0, R1)=R0*X'0040' . (Properly positioned biased
                                Y exponent is now in R1.)

         (Fix mantissa.)

         R3=R3 .OR. R6 . (Merge lo bits of Y mantissa; Ylo.)
         R5=R5 .AND. X'FF7F' (Clear lead mantissa bit.)
         R2=R2 .OR. R5 (Sign and biased mantissa merge.)
         R2=R2 .OR. R1 (Merge biased Y exponent into Yhi.)
         R0=R3 (Move Ylo into R0.)
         RE=0 . (Status.)
         RETURN (by way of RF).

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |   |   |   |   |   |   |   |   |   |Yhi| - |Ylo|


       Else

         (X true exponent is less than 0 but greater than or
         equal to -5. Use Y=X*(1-G(X)) for computing Y.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   | mX|   |   |   |   |int|   |int|   |   |   |
    Use: |   |   | hi| - | lo| - |   |   | - ||X|| - ||X|| SX| - |E25|
         |   |   |   |   |   |   |   |   |   | hi|   | lo|   |   |   |


4.6.D.1  R6=RA . (Save X true mantissa low in R6.)
         RB=R0 . (Save 2*(X true exponent + 5) in RB.)
         RE=R2 . (Save X sign in RE.)
         R9=R5 . (Move int X hi (Uhi) into R9.)
         RA=R3 . (Move int X lo (Ulo) into RA.)
         RD=0 . (Load sign bit of U, a magnitude, into RD.)
         R7=N1 (Set degree for calculating "p1" where
                p1=G(X)=(1-ATAN(U)/U) .)
         R8=COEF1+4*N1+2 (Set to last 16 bit address of COEF1
                          block.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   | mX|   |int|int|   |   | mX|   |   |   |   |   |   |
    Use: | SX| SU| hi|E25||X|||X||loc|cnt| lo| - | - | - | - | - | - |
         |   |   |   |   | lo| hi|   |   |   |   |   |   |   |   |   |
         |   |   |   |   | = | = |   |   |   |   |   |   |   |   |   |
         |   |   |   |   |Ulo|Uhi|   |   |   |   |   |   |   |   |   |


4.6.D.3  BAL,RF POLY32. (Compute "p1" polynomial.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |  0| mX|   |int|int|   |   | mX|   |   |   | p1|   | p1|
    Use: | SX| SU| hi|E25||X|||X|| - | - | lo| - | - | - | hi| - | lo|
         |   |   |   |   | lo| hi|   |   |   |   |   |   |   |   |   |
         |   |   |   |   | = | = |   |   |   |   |   |   |   |   |   |
         |   |   |   |   |Ulo|Uhi|   |   |   |   |   |   |   |   |   |
         (Generate C=(1-G(X)) .)

4.6.D.5  R2=.NOT. R2 . (G(X)=p1. Radix point of p1 is at left
                        edge of R2. Complement of (R2, R0)
                        is (1-G(X)) as an unsigned number; Chi.)
         R0=.NOT. R0 . (Low part of (1-G(X)); Clo.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |  0| mX|   |   |   |   |   | mX|   |   |   |   |   |   |
    Use: | SX| SU| hi|E25| - | - | - | - | lo| - | - | - |Chi| - |Clo|

         (Multiply the X true mantissa times "C" to get the value,
         mX*(1-G(X)). Call the result mV. Put into (R2, R0).)

4.6.D.7  (R6, R7)=R6*R2 (card) . (mXlo*Chi.)
         (R0, R1)=R0*RC (card) . (Clo*mXhi.)
         (R2, R3)=R2*RC (card) . (Chi*mXhi.)
         R6=R6+R3, save carry out. (Combine lower bits of partial
                                    product, mV=mX*(1-G(X)).)
         R2=R2+ carry in . (Combine upper bits of partial
                            product, mV=mX*(1-G(X)).)
         R0=R0+R6, save carry out. (Combine lower bits of complete
                                    product, mV=mX*(1-G(X)).)
         R2=R2+ carry in . (Combine upper bits of complete
                            product, mV=mX*(1-G(X)).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |  0|   |   |   |   |   |   |   |   |   |   | mV|   | mV|
    Use: | SX| SU| - |E25| - | - | - | - | - | - | - | - | hi| - | lo|

         (If R2 lead bit is 1 (i.e., if it looks negative), the
         value is in the mantissa range of the output Y. Else,
         mV is a hair below .5 and needs to be multiplied by 2.
         Fix and produce output.)

4.6.D.9  If R2=negative (Pseudonegative; really a magnitude.)

           (Pseudonegative branch direction; really, the mV
           magnitude is = to or > .5 .)

4.6.D.9.0  RB=RB+2*(128-5) (Biased exponent of Y times 2;
                            need to multiply it by 64.)
           (R2, R3)=R2*256 (Y true mantissa hi 8 bits
                            properly positioned in R2.)
           (R0, R1)=R0*256 (Y true mantissa lowest 16 bits
                            properly positioned in R0.)

         Else

4.6.D.9.1  RB=RB+2*(128-6) (Biased exponent of Y times 2;
                            need to multiply it by 64.)
           (R2, R3)=R2*512 (Y true mantissa hi 8 bits
                            properly positioned in R2.)
           (R0, R1)=R0*512 (Y true mantissa lowest 16 bits
                            properly positioned in R0.)

         End (4.6.D.9) If.

         (Fix exponent.)

4.6.D.B  (RB, RC)=RB*X'0040' . (Properly positioned biased
                                Y exponent is now in RC.)

         (Fix mantissa.)

         R0=R0 .OR. R3 . (Merge lo bits of Y mantissa; Ylo.)
         R2=R2 .AND. X'FF7F' (Clear lead mantissa bit.)
         R2=R2 .OR. RE (Sign and biased mantissa merge.)
         R2=R2 .OR. RC (Merge biased Y exponent into Yhi.)
         RE=0 . (Status.)
         RETURN (by way of RF).

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |   |   |   |   |   |   |   |   |   |Yhi| - |Ylo|


End (4.6.D) If. 


Else 


MPP SCIENTIFIC SUBROUTINES 


GOODYEAR AEROSPACE 
CORPORATION 
GER-17221 




     (X true exponent is 0 or + starting here.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |Xhi|   |Xlo|   |   |   |   |   |   |   |   | - |

4.1  RE=X'8000' . (Prepare to capture X sign in RE(0); clear other bits.)
     RE=RE .AND. RB . (Sign of X in RE; other RE bits cleared.)
     RB=RB .EXCLUSIVE OR. RE . (X changed to a magnitude, |X|.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   ||X||   ||X||   |   |   |   |   |   |   |   |   |
    Use: | SX|   |   | hi|   | lo|   |   |   |   |   |   |   |   | - |


4.3  (RB, RC)=RB*X'0200' . (|X| biased mantissa hi left justified in RC;
                            |X| biased exponent, EXP, is in RB, right
                            justified.)
     (R9, RA)=R9*X'0200' . (|X| biased mantissa lo left justified
                            (R9, RA).)
     R9=R9 .OR. RC . (|X| biased mantissa hi merged into R9.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   | mX| mX|   |   |   |   |   |   |   |   |   |
    Use: | SX|   | - |EXP| lo| hi|   |   |   |   |   |   |   |   | - |


4.5  RB=RB-128 . (Unbiased exponent=EX.)

4.7  If RB=0

       (True exponent of |X| is 0 along this path. Polynomial to be
       used is setup for the |X| interval, .5 <= |X| < 1.0 .)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   | mX| mX|   |   |   |   |   |   |   |   |   |
    Use: | SX|   | - | EX| lo| hi|   |   |   |   |   |   |   |   | - |


       (Bias down mantissa by an additional .25; then, describe
       the resultant U as a magnitude with sign in RD.)

       (Create true |X| mantissa-.75 . Then conceptually multiply
       result by 2 to create U. Radix point will then be at left
       edge of R9.)

4.7.0  RD=.NOT. R9 . (Lead bit of RD, RD(0), now contains sign bit
                      of U.)
       RD=RD .AND. X'8000' . (Clears all but U sign bit.)

4.7.2  If RD=0

         (Sign bit of U is 0, i.e., +.)

4.7.2.0  R9=R9 .EXCLUSIVE OR. X'8000' . (Clear lead bit of R9
                                         when true mantissa-.75 is
                                         +; creates Uhi. Ulo
                                         already exists. Uhi, Ulo
                                         is the magnitude of U.)

       Else

4.7.2.1  R9=R9 .EXCLUSIVE OR. X'7FFF' . (Clear lead bit of R9
                                         when true mantissa-.75 is
                                         -; complement remaining
                                         bits of R9 to create Uhi.
                                         Now proceed to complement
                                         RA which becomes Ulo. Uhi,
                                         Ulo is the magnitude of
                                         U.)
         RA=RA .EXCLUSIVE OR. X'FFFF' . (Complement complete.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: | SX| SU| - | EX|Ulo|Uhi|   |   |   |   |   |   |   |   | - |

       End (4.7.2) If.
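
Steps 4.7.0 through 4.7.2 convert the offset mantissa word into a sign and a magnitude. A Python trace of the bit operations is sketched below; the function names are illustrative, and the value helper assumes the radix point at the left edge of Uhi, as the comments above state.

```python
def make_u(r9, ra):
    """Steps 4.7.0 to 4.7.2: return (sign word, Uhi, Ulo)."""
    rd = (~r9) & 0x8000
    if rd == 0:
        r9 ^= 0x8000                 # U >= 0: drop the lead bit
    else:
        r9 ^= 0x7FFF                 # U < 0: complement the magnitude bits
        ra ^= 0xFFFF
    return rd, r9, ra

def u_value(rd, r9, ra):
    """Signed value of U with the radix point at the left edge of Uhi."""
    mag = ((r9 << 16) | ra) / 2.0 ** 32
    return -mag if rd else mag
```

Because the negative branch takes a ones' complement, the magnitude of a negative U comes out one unit in the last place short of the exact value.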





4.7.4  R7=N2 (Set degree of polynomial for calculating "p2",
              p2=ATAN(ARG)-.5, ARG=(U+1.5)/2 .)
       R8=COEF2+4*N2+2 (Set to last 16 bit address of COEF2 block,
                        the coefficients of p2.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: | SX| SU| - | EX|Ulo|Uhi|loc|cnt|   |   |   |   |   |   | - |

       BAL,RF POLY32. (Compute "p2" polynomial.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   |   |   |   |   |   | p2|   | p2|
    Use: | SX| SU| - | EX|Ulo|Uhi| - | - |   |   | - | - | hi| - | lo|

       (If R2 lead bit is 1 (i.e., if p2 is negative), the value
       produces a Y mantissa that is a hair below .5 . In such case
       the final Y true exponent must be -1 and the bits of p2 must
       be shifted left 1 bit position more than normal. Else, when p2
       is positive, the Y true exponent is 0; a normal alignment
       shift is required.)

       (Fix p2 and produce output.)

4.7.6  If R2=negative

         (The p2 value is negative branch direction.)

4.7.6.0  (R2, R3)=R2*512 (Y true mantissa hi 8 bits
                          properly positioned in R2.)
         (R0, R1)=R0*512 (Y true mantissa lowest 16 bits
                          properly positioned in R0.)
         R2=R2 .OR. X'3F80' . (Merge Y biased exponent of 127
                               into Yhi.)

       Else

         (The p2 value is .5 or somewhat greater branch
         direction.)

4.7.6.1  (R2, R3)=R2*256 (Y true mantissa hi 8 bits
                          properly positioned in R2.)
         (R0, R1)=R0*256 (Y true mantissa lowest 16 bits
                          properly positioned in R0.)
         R2=R2 .OR. X'4000' . (Merge Y biased exponent of 128
                               into Yhi.)

       End (4.7.6) If.
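
The exponent constants merged into Yhi are simply the biased exponent multiplied by 128, which places it just above the 7 high mantissa bits. A short Python check (the function name is illustrative):

```python
def exponent_field(biased_exponent):
    """Position an 8-bit biased exponent above the 7 high mantissa bits."""
    return biased_exponent << 7       # same as biased_exponent * 128

fields = [hex(exponent_field(e)) for e in (127, 128, 129)]
```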


       (Fix mantissa.)

4.7.8  R0=R0 .OR. R3 . (Merge lo bits of Y mantissa; Ylo.)
       R2=R2 .OR. RE . (Merge Y sign into Yhi.)
       RE=0 . (Status.)
       RETURN (by way of RF).

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: |   |   |   |   |   |   |   |   |   |   |   |   |Yhi| - |Ylo|


     Else

       (True exponent of |X| is greater than 0 along this path.
       It will be necessary to differentiate the |X| true exponent=+1
       case from the cases for which the exponent is greater.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   | mX| mX|   |   |   |   |   |   |   |   |   |
    Use: | SX|   | - | EX| lo| hi|   |   |   |   |   |   |   |   | - |


       (Develop EX2=EX-2.)

4.7.1  RB=RB-2 . (RB contains the |X| true exponent -2.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   | mX| mX|   |   |   |   |   |   |   |   |   |
    Use: | SX|   | - |EX2| lo| hi|   |   |   |   |   |   |   |   | - |


4.7.3  If RB=negative

         (The |X| true exponent is +1 to branch in this direction.
         Polynomial to be used is setup for the |X| interval,
         1.0 <= |X| < 2.0 .)

         (Bias down mantissa by an additional .25; then, describe
         the resultant U as a magnitude with sign in RD, i.e.,
         create (true |X| mantissa-.75) . Then conceptually
         multiply result by 2 to create U. Radix point will then
         be at left edge of R9.)

4.7.3.0  RD=.NOT. R9 . (Lead bit of RD, RD(0), now contains sign
                        bit of U.)
         RD=RD .AND. X'8000' . (Clears all but U sign bit.)

4.7.3.2  If RD=0

           (Sign bit of U is 0, i.e., +.)

4.7.3.2.0  R9=R9 .EXCLUSIVE OR. X'8000' . (Clear lead bit of
                                           R9 when true mantissa-.75
                                           is +; creates Uhi. Ulo
                                           already exists. Uhi, Ulo
                                           is the magnitude of U.)

         Else

4.7.3.2.1  R9=R9 .EXCLUSIVE OR. X'7FFF' . (Clear lead bit of
                                           R9 when true mantissa-.75
                                           is -; complement remaining
                                           bits of R9 to create Uhi.
                                           Now proceed to complement
                                           RA which becomes Ulo. Uhi,
                                           Ulo is the magnitude of
                                           U.)
           RA=RA .EXCLUSIVE OR. X'FFFF' . (Complement done.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: | SX| SU| - |EX2|Ulo|Uhi|   |   |   |   |   |   |   |   | - |

         End (4.7.3.2) If.

4.7.3.4  R7=N3 (Set degree of polynomial for calculating "p3",
                p3=(ATAN(ARG)/2)-.5, ARG=(U+1.5)/2 .)
         R8=COEF3+4*N3+2 (Set to last 16 bit address of COEF3
                          block, the coefficients of p3.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: | SX| SU| - |EX2|Ulo|Uhi|loc|cnt|   |   |   |   |   |   | - |

         BAL,RF POLY32. (Compute "p3" polynomial.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   |   |   |   |   |   | p3|   | p3|
    Use: | SX| SU| - |EX2|Ulo|Uhi| - | - |   |   | - | - | hi| - | lo|


         (If R2 lead bit is 1 (i.e., if p3 is negative), the value
         produces a Y mantissa that is a hair below .5 . In such
         case the final Y true exponent must be 0 and the bits of
         p3 must be shifted left 1 bit position more than normal.
         Else, when p3 is positive, the Y true exponent is 1; a
         normal alignment shift is required.)

         (Fix p3 and produce output.)

4.7.3.6  If R2=negative

           (The p3 value is negative branch direction.)

4.7.3.6.0  (R2, R3)=R2*512 (Y true mantissa hi 8 bits
                            properly positioned in R2.)
           (R0, R1)=R0*512 (Y true mantissa lowest 16 bits
                            properly positioned in R0.)
           R2=R2+(125*128) . (Develop Y biased exponent of
                              128 in R2.)

         Else

           (The p3 value is .5 or somewhat greater branch
           direction.)

4.7.3.6.1  (R2, R3)=R2*256 (Y true mantissa hi 8 bits
                            properly positioned in R2.)
           (R0, R1)=R0*256 (Y true mantissa lowest 16 bits
                            properly positioned in R0.)
           R2=R2 .OR. X'4080' . (Merge Y biased exponent of
                                 129 into Yhi.)

         End (4.7.3.6) If.


         (Fix mantissa.)

4.7.3.8  R0=R0 .OR. R3 . (Merge lo bits of Y mantissa; Ylo.)
         R2=R2 .OR. RE . (Merge Y sign into Yhi.)
         RE=0 . (Status.)
         RETURN (by way of RF).

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: | - | - | - | - | - | - | - | - | - | - | - | - |Yhi| - |Ylo|


       Else

         (The |X| true exponent is greater than +1 to branch in
         this direction. The |X| values will be 2.0 or greater.
         Use 1/|X| in order
         to get suitable expansions for the ATAN function.
         Proceed to find the reciprocal of |X|.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   | mX| mX|   |   |   |   |   |   |   |   |   |
    Use: | SX|   | - |EX2| lo| hi|   |   |   |   |   |   |   |   | - |


4.7.3.1  R5=R9 . (Replicate mXhi in R5.)
         (R5, R6)=R5*X'0020' . (Multiply range index, "j",
                                j=0,...,7 , by 32; R5=4*j
                                results in bits 11,...,13 .)
         R5=R5 .AND. X'001C' . (Clear all but bits 11,...,13 .)
         R9=R9 .AND. X'1FFF' . (Clear range interval index bits,
                                "j", of mXhi. Creates Uhi in R9
                                and Ulo in RA.)
         RD=0 . (Clear sign bit of U. U is positive only.)
         R5=R5+GET . (R5 points to polynomial degree number
                      in "GET" table.)

                           "GET" Table

         j    k    4*j+k    Address    Value

         0    0    4*0+0    GET+ 0     M(0)
         0    2    4*0+2    GET+ 2     KOEF(0)+4*M(0)+2
         1    0    4*1+0    GET+ 4     M(1)
         1    2    4*1+2    GET+ 6     KOEF(1)+4*M(1)+2
         2    0    4*2+0    GET+ 8     M(2)
         2    2    4*2+2    GET+10     KOEF(2)+4*M(2)+2
         3    0    4*3+0    GET+12     M(3)
         3    2    4*3+2    GET+14     KOEF(3)+4*M(3)+2
         4    0    4*4+0    GET+16     M(4)
         4    2    4*4+2    GET+18     KOEF(4)+4*M(4)+2
         5    0    4*5+0    GET+20     M(5)
         5    2    4*5+2    GET+22     KOEF(5)+4*M(5)+2
         6    0    4*6+0    GET+24     M(6)
         6    2    4*6+2    GET+26     KOEF(6)+4*M(6)+2
         7    0    4*7+0    GET+28     M(7)
         7    2    4*7+2    GET+30     KOEF(7)+4*M(7)+2


         R7=0(R5); R5=R5+2 . (Load R7 with degree of selected
                              polynomial, M(j), held at address
                              pointed to by value in R5. Then
                              bump "GET" table pointer, R5.)
         R8=0(R5) . (Load R8 with KOEF(j)+4*M(j)+2 ; the last
                     16 bit address of the KOEF(j) block, the
                     coefficients of the polynomial, r(j), held
                     at address pointed to by value in R5.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
    Use: | SX| SU| - |EX2|Ulo|Uhi|loc|cnt| - | - |   |   |   |   | - |
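
The range index extraction of step 4.7.3.1 can be traced in Python as below. The sketch treats the multiply by X'0020' as an unsigned 16 by 16 multiply whose high word is kept (an assumption about the MCU multiply), and the function name and sample word are illustrative.

```python
def range_index_offset(r9):
    """Steps of 4.7.3.1 on the mantissa word: returns (4*j, Uhi)."""
    r5 = ((r9 * 0x0020) >> 16) & 0x001C   # top three bits of r9, times 4
    uhi = r9 & 0x1FFF                     # remaining 13 bits
    return r5, uhi

r5, uhi = range_index_offset(0xB234)      # top bits 101, so j = 5
```

The offset 4*j selects the pair of "GET" table words for interval j: the degree M(j) at GET+4*j and the coefficient block end address at GET+4*j+2.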

4.7.3.3  BAL,RF POLY32. (Compute "r(j)" polynomial.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   |   |   |   |   |   | rj|   | rj|
    Use: | SX| SU| - |EX2|Ulo|Uhi| - | - |   |   | - | - | hi| - | lo|


         (The value of rj is given by

           rj=1/(8*ARG) for ARG=1/2+j/16+U/2 , 0 <= U < 1/8.

         If R2(1) bit is 1, the final |X| reciprocal true
         exponent must be -(EX2+2)+2=-EX2+0=(.NOT. EX2)+1 ; a
         normal alignment shift is required. If R2 lead bit is 0,
         the final |X| reciprocal true exponent must be
         -(EX2+2)+1=-EX2-1=(.NOT. EX2) and the bits of r(j) must
         be shifted left 1 bit position more than normal.)

         (Fix r(j) and produce output |X| reciprocal.)

4.7.3.5  RB=.NOT. RB . (Complement EX2.)

4.7.3.7  If R2(1) is set (Rare case; ARG=.5 test.)

           (Normal alignment branch direction.)

4.7.3.7.0  (R2, R3)=R2*X'200' . (Y true mantissa hi 8 bits
                                 properly positioned in R2.)
           (R0, R1)=R0*X'200' . (Y true mantissa lowest 16
                                 bits properly positioned in
                                 R0.)
           RB=RB+128 . (Biased exponent of reciprocal-1.)

         Else

           (Additional bit shift alignment branch direction.)

4.7.3.7.1  (R2, R3)=R2*X'400' . (Y true mantissa hi 8 bits
                                 properly positioned in R2.)
           (R0, R1)=R0*X'400' . (Y true mantissa lowest 16
                                 bits properly positioned in
                                 R0.)
           RB=RB+127 . (Biased exponent of reciprocal-1.)

         End (4.7.3.7) If.
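
The exponent bookkeeping rests on the two's complement identity .NOT. EX2 = -EX2-1. The Python check below uses 16-bit words and illustrative helper names, and confirms that adding 128 or 127 to the complemented EX2 yields the biased exponent of the reciprocal minus 1 in each branch.

```python
def not16(x):
    """16-bit ones' complement."""
    return (~x) & 0xFFFF

def to_signed(x):
    """Interpret a 16-bit word as a two's complement integer."""
    return x - 0x10000 if x & 0x8000 else x

for ex2 in range(0, 126):
    rb = to_signed(not16(ex2))
    assert rb == -ex2 - 1
    # normal alignment: true exponent -(EX2+2)+2, biased with 128, less 1
    assert rb + 128 == (-(ex2 + 2) + 2) + 128 - 1
    # extra shift: true exponent -(EX2+2)+1, biased with 128, less 1
    assert rb + 127 == (-(ex2 + 2) + 1) + 128 - 1
```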


         (Assemble X reciprocal, XR, in (RB, R9).)

4.7.3.9  (RB, RC)=RB*X'0080' . (Position exponent in RC.)
         RB=RC . (Move aligned, biased exponent into RB.)
         RB=RB+R2 . (Add aligned (biased exponent-1) & aligned
                     hi mantissa.)
         RB=RB .OR. RE . (Merge sign, biased exponent & hi
                          mantissa.)
         R0=R0 .OR. R3 . (Merge lo mantissa bits into R0.)
         R9=R0 . (Move mantissa lo into R9.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   | XR|   | XR|   |   |   |   |   |   |   |   |   |
    Use: | - | - | - | hi| - | lo| - | - |   |   | - | - | - | - | - |


         (Reciprocal of X is now in the input X slot; find the
         ATAN of the reciprocal.)

4.7.3.B  BAL,RF ATANF entry . (Get ANG. Answer is PI/2-ANG.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   |   |   |   |   |   |ANG|   |ANG|
    Use: | - | - | - | - | - | - | - | - | - | - | - | - | hi| - | lo|
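
The reduction used on this path is the identity ATAN(X)=PI/2-ATAN(1/X) for X > 0, with the sign of X restored afterwards. A floating-point Python illustration follows; `atan_large` is an illustrative name and does not reproduce the fixed-point subtraction carried out by the subroutine.

```python
import math

def atan_large(x):
    """atan for |x| >= 2 through the reciprocal, keeping the sign of x."""
    ang = math.atan(1.0 / abs(x))
    return math.copysign(math.pi / 2 - ang, x)
```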
         (Compute PI/2-ANG. ANG is floating point.)

4.7.3.D  RE=X'8000' . (Prepare for sign bit isolation.)
         RE=RE .AND. R2 . (ANG sign bit in RE(0); 0 elsewhere.
                           RE value is SA.)
         R2=R2 .EXCLUSIVE OR. RE . (Absolute value of ANG, |A|,
                                    now in (R2, R0).)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   |   |   |   |   |   ||A||   ||A||
    Use: | SA| - | - | - | - | - | - | - | - |   |   |   | hi| - | lo|


         (Put 2 times the ANG biased exponent into R2. Also, save
         |A| hi in R7.)

4.7.3.F  R7=R2 . (Save |A| in R7.)
         (R2, R3)=R2*X'0400' . (Bits 7,...,14 of R2 now holds
                                biased |A| exponent.)
         R2=R2 .AND. X'01FE' . (R2 now holds
                                2*(biased |A| exponent)=2EB.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |   ||A||   |   |   |   |   |   ||A||
    Use: | SA| - | - | - | - | - | - | hi| - | - | - | - |2EB| - | lo|


         (Align |A| biased mantissa so that radix point is at
         left edge of R7; then unbias it.)

4.7.3.11  (R7, R8)=R7*X'0100' . (Move |A| biased mantissa hi into
                                 R8.)
          (R0, R1)=R0*X'0100' . (Shift hi part of biased mantissa
                                 lo into R0 and lo part into R1.)
          R8=R8+R0 . (Merge biased mantissa hi pieces. Results in
                      biased mantissa hi of |A| in R8; lo part is
                      in R1.)
          R8=R8 .OR. X'8000' . (Unbias |A| biased mantissa hi
                                pieces. |A| true mantissa hi, mAu
                                hi, is in R8; |A| true mantissa
                                lo, mAu lo, is in R1.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |mAu|   |   |   |   |   |   |mAu|   |
    Use: | SA| - | - | - | - | - | hi| - | - | - | - | - |2EB| lo| - |


         (Develop |A| as an integer. Use the data derived from
         ANG (SA, mAu, and 2EB) to do this. The 2EB value will be
         used to access a shift constant from the SHF table; that
         constant will be used to shift the mantissa (mAu) bits
         to the left (relative to the mantissa's radix point); the
         resultant value is the integer form of |A|.
         Initially, consider that mAu has been divided by 2**30;
         the radix point is then 30 bit positions to the left
         of the left edge of R8. In fact, if the 4 register
         combination, (RC, RA, R8, R1), contains this value, then
         RC and RA are conceptually filled with "0"'s and the
         radix point lies 2 bits in from the left edge of RC, i.e.,
         between RC(1) and RC(2). To get the true integer value of
         |A|, this value must be multiplied by 2**(EXA+30) where
         EXA is the true exponent of |A| and the "30" compensates
         for the earlier division by 2**30. The quantity
         2**(EXA+30) is the value extracted by appropriately
         entering the SHF table.)

(Develop SHF table entry address.) 

4.7.3.13 R2sR2-(( 128-30) *2) . (Two times (|A| true exponent+30) is 

put into R2 after removal of 
exponent bias; result is called 
2EA.) 

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |mAu|   |   |   |   |   |   |mAu|   |
     Use:| SA| - | - | - | - | - | hi| - | - | - | - | - |2EA| lo| - |


(Create output Y for different exponent ranges of |A|. The
true exponent value of |A| can't be greater than -1.)

4.7.3.15 If R2=negative

(|A| true exponent is -31 or less. |Y|=PI/2.)
4.7.3.15.0 R2=X'40C9' . (PI/2 (hi) for exponent -31 to -128.)

R0=X'0FDB' . (PI/2 (lo).)

R2=R2 .OR. RE . (Merge in sign bit of output.)

RE=0 . (Set status.)

RETURN via RF .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:| - | - | - | - | - | - | - | - | - | - | - | - |Yhi| - |Ylo|



Else 

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |   |   |mAu|   |   |   |   |   |   |mAu|   |
     Use:| SA| - | - | - | - | - | hi| - | - | - | - | - |2EA| lo| - |

(The integer form of |A| must be developed in order
to execute the subtraction from PI/2; the mAu
value must be shifted relative to the radix point by
an amount determined by |A|'s true exponent. The
|A| exponent range is from -31 up through -1.)

4.7.3.15.1 If R2(10)=1 .

(Data of (RC, RA, R8, R1) must be shifted left
by 16.)

4.7.3.15.1.0 RA=R8 .

R8=R1 .

R2=R2 .AND. X'001E' . (Clear lead bit of 2EA.)

Else

4.7.3.15.1.1 RA=0 . (Initialize RA.)

End (4.7.3.15.1) If

(Do the remaining (less than 16) shift to convert
|A| to an integer.)

4.7.3.15.3 R2=R2+SHF . (Develop address into table to get
shift factor.)

R5=0(R2) . (Put shift value into R5.)

(RA, RB)=RA*R5 . (Do hi shift. The role of RC of
the 4 register set, (RC,RA,R8,R1),
is now taken on by RA.)

(R8, R9)=R8*R5 . (Do lo shift. The role of RA of
the 4 register set, (RC,RA,R8,R1),
is now taken on by R8.)

R8=R8 .OR. RB . (Merge bits of similar level of
significance in R8.)


Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
         |   |   |   |   |int|   |int|   |   |   |   |   |   |   |   |
     Use:| SA| - | - | - ||A|| - ||A|| - | - | - | - | - | - | - | - |
         |   |   |   |   | hi|   | lo|   |   |   |   |   |   |   |   |

((RA, R8) now contains the integer value of |A|; the
radix point lies between RA(1) and RA(2). Now
execute the (PI/2-(int |A|)) process.)

4.7.3.15.5 R8=.NOT.R8 . (Negate int |A| lo.)

RA=.NOT.RA . (Negate int |A| hi.)

R8=R8+X'PI/2' , save carry out. (Add PI/2 lo.)
RA=RA+X'PI/2' , carry in. (Add PI/2 hi.)

(Convert int |Y| into floating point Y.)

4.7.3.15.7 (R8, R9)=R8*X'0400' . (Align |Y| mantissa lo.)

(RA, RB)=RA*X'0400' . (Align |Y| mantissa hi.)

R8=R8 .OR. RB . (Merge mantissa pieces.)

R0=R8 . (Move Ylo into R0.)

R2=RA . (Move |Y| hi into R2.)

R2=R2 .OR. X'4080' . (Merge in biased exponent of
|Y| corresponding to Y true exponent of 1.)

R2=R2 .OR. RE . (Insert sign bit of Y.)

RE=0 . (Set status.)

RETURN via RF .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:| - | - | - | - | - | - | - | - | - | - | - | - |Yhi| - |Ylo|

End (4.7.3.15) If
End (4.7.3) If
End (4.7) If
End (4) If
END 
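
The PI/2 subtraction of step 4.7.3.15.5 can be sketched in Python as follows. This is an illustrative model, not the MCU code: it assumes 16-bit registers, a fixed-point layout with the radix point two bits in from the left edge of the high word (30 fraction bits), and a PI/2 constant scaled the same way. The names `sub_from_half_pi`, `to_float`, and `HALF_PI_FIXED` are hypothetical. Because the step negates with .NOT. (a one's complement), the modeled result is one unit in the last place below the exact difference.

```python
import math

MASK16 = 0xFFFF
FRACTION_BITS = 30  # assumed: radix point lies between bits 1 and 2 of the high word
# Hypothetical stand-in for the X'PI/2' constants, scaled to the assumed layout.
HALF_PI_FIXED = int(math.pi / 2 * (1 << FRACTION_BITS))


def sub_from_half_pi(a_hi, a_lo):
    """Model RA/R8 = PI/2 - int|A| using .NOT. and a two-word add with carry."""
    pi_hi, pi_lo = HALF_PI_FIXED >> 16, HALF_PI_FIXED & MASK16
    lo = (~a_lo & MASK16) + pi_lo           # R8 = .NOT.R8 + PI/2 lo, save carry out
    carry = lo >> 16
    hi = (~a_hi & MASK16) + pi_hi + carry   # RA = .NOT.RA + PI/2 hi, carry in
    return hi & MASK16, lo & MASK16


def to_float(hi, lo):
    """Interpret the (hi, lo) pair in the assumed fixed-point layout."""
    return ((hi << 16) | lo) / (1 << FRACTION_BITS)
```

For example, `to_float(*sub_from_half_pi(0x1000, 0x0000))` models PI/2 - 0.25 in this layout.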



3.2.5 MCU NATURAL LOG SUBROUTINE DESCRIPTION : LNM 


This subroutine develops the value, "Y", the natural logarithm of the input
variable, "X". "X", the input, and "Y", the output, are 32 bit VAX floating
point numbers. Along with "Y", a 16 bit status value, S, is generated for
output; it is 0 when X is positive and non-zero. The theory used for this MCU
computation of the natural logarithm is identical to that for the parallel
array algorithm.


The subroutine demands the calculation of :

Y=LN(X)
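
As a reading aid for the register manipulations that follow, here is a minimal Python sketch of how a 32-bit VAX floating point number, held as two 16-bit halves, maps to a value. It assumes the standard VAX F layout (sign bit, 8-bit excess-128 exponent, 23 stored fraction bits with a hidden leading 1). The function name `decode_vax_f` is hypothetical, and reserved operands are simply treated as zero here.

```python
def decode_vax_f(hi, lo):
    """Decode a VAX F number from its high word (sign, exponent, 7 fraction
    bits) and low word (16 fraction bits)."""
    sign = -1.0 if hi & 0x8000 else 1.0
    biased_exp = (hi >> 7) & 0xFF
    if biased_exp == 0:
        return 0.0  # sketch only: zero (and reserved operand) handled as 0
    # Restore the hidden leading fraction bit: fraction lies in [0.5, 1).
    mantissa = 0x800000 | ((hi & 0x7F) << 16) | lo
    return sign * (mantissa / (1 << 24)) * 2.0 ** (biased_exp - 128)
```

With this layout, X'4080' in the high half and X'0000' in the low half decode to 1.0, and X'40C9', X'0FDB' decode to approximately PI/2.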

0. (ENTER LOGM.) 

1. (Set the status bits to 0.)

Load R14 with X'0000'.

2. (Check for X=negative.)

If bit 0 of R11 is set
then,

Load register R2 with X'0003'.

RETURN. Fatal error status if X was negative.

Else,

Continue.

End If.

3. (Check for X=0.)

If R9=0,

If R11=0,

Load register R2 with X'FFFF'.

Load register R0 with X'FFFF'.

Load register R14 with '2' to indicate underflow.
RETURN.

Else,

Continue.


Else


Continue.


4. R9 and R11 contain X. Adjust the R11 X bits 9 bit positions to the left so
that the shifted R11 X bits lie in R12 and R11 as follows:


Bit:           0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Address R11:  m1  m2  m3  m4  m5  m6  m7   0   0   0   0   0   0   0   0   0

Bit:           0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Address R12:   0   0   0   0   0   0   0   S  e0  e1  e2  e3  e4  e5  e6  e7


Perform cardinal multiply :

MLU R11,#X'200' .

5. Adjust the R9 X bits 9 bit positions to the left so that the shifted R9
X bits lie in R10 and R9 as follows:


Bit:           0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Address R10: m17 m18 m19 m20 m21 m22 m23   0   0   0   0   0   0   0   0   0

Bit:           0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Address R9:    0   0   0   0   0   0   0  m8  m9 m10 m11 m12 m13 m14 m15 m16


Perform cardinal multiply :

MLU R9,#X'200'

6. "OR" R9 with R12 to get:


Bit:           0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Address R9:   m1  m2  m3  m4  m5  m6  m7  m8  m9 m10 m11 m12 m13 m14 m15 m16


7. If the most significant mantissa bit is set,
then, use LNCOEFF1 coefficients;

else, use LNCOEFF0 coefficients;

In general the polynomial coefficients are:


HEX address     Coefficient

COEF+ 0         A(0) (low)
COEF+ 1         A(0) (high)
COEF+ 2         A(1) (low)
COEF+ 3         A(1) (high)
   .              .
   .              .
COEF+2*N        A(N) (low)
COEF+2*N+1      A(N) (high)


Note that POLY computes:

H=A(0)+fb*(A(1)+fb*(A(2)+fb*(A(3)+fb*(A(4)+...+fb*(A(N-1)+fb*(A(N)))...).
The steps to use POLY follow.

Load R8 with COEF block starting address.
Load R7 with the degree of the polynomial, N.


CALL R15,PLY32$ to compute LN(ARG)-ARG.


8. Then, because the desired function is actually LN(ARG),
add in the input ARG contained in R10 and R9:

ADD R0,R10

ADDC R2,R9 .

We thus have computed the LN(1+U) term of the required:

Y = EXP*LN(2) - (128+1)*LN(2) + LN(1+U) - U .

9. Subtract 128+1 from the input exponent:

SUB R11,#128+1 for use in above equation;

If R11 is negative,

then, Perform 2's complement of R11
and remember sign bit in R14 .
else, continue.

10. Save the exponent of R11 in R6. Perform the multiplication of the
exponent by the constant LN(2):

MLU R11,#X'B172' which is the high half of LN(2)

and,

MLU R6,#X'17F8' which is the low half.


Merge the two multiplication halves:

ADD R6,R12
ADDC R11

11. If EXP-129 was negative, the product is negative,

then, complement R6, R7, and R11 which contain (EXP-129)*LN(2) .

12. Add the Polynomial results to R6, and R7 :

ADD R7,R0

ADDC R6,R2

INCRC R11

13. If this final result is negative,

then, remember the final sign bit in R14,

and complement R11, R6, and R7 to get a positive number.

Now the un-normalized Y value is contained in registers
R11, R6 and R7.

14. Normalize the Y value as follows:

If R11 is '0',

then, R6 and R7 contain the fractional bits;

CLR R10 no exponent bias required;

CALL R15,NORMV$ to normalize Y.

The output of NORMV$ is VAX 32-bit format in R2 and R0.
CLR R14 clear status and,

RETURN .

else,

continue.


15. Normalize the 48 possible bits in R11, R6 and R7:

Since the maximum number of exponent bits is 7,

Shift the value in R11 by 9 places (16 bit register -7) in order
to speed the normalization:

MLU R11,#X'200'

LR R8,R7 save R7 temporarily;

MLU R6,#X'200'

MLU R8,#X'200' .

Then merge the results:

OR R6,R12
OR R7,R8 .

Save the shift count in R10 to be passed to NORMV$
so that the exponent may be adjusted; include also the 16 bits of R11:

LR R10,#16-9 .

For the normalize routine NORMV$, the input registers are R3 and R1 :

LR R3,R6

LR R1,R7

CALL R15,NORMV$ .

Clear the status register:

CLR R14 and,

RETURN .
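
The overall decomposition used by LNM can be checked numerically with a short Python sketch. It reads the relation of steps 8 through 12 as Y = (EXP-129)*LN(2) + LN(1+U), assuming X = (1+U) * 2**(EXP-129) with EXP the excess-128 biased exponent and 0 <= U < 1. The function names are hypothetical, floating point stands in for the fixed-point registers, and `math.log1p` stands in for the polynomial routine.

```python
import math

LN2 = math.log(2.0)


def split_vax_style(x):
    """Return (EXP, U) such that x = (1 + U) * 2**(EXP - 129), 0 <= U < 1."""
    frac, exp2 = math.frexp(x)   # x = frac * 2**exp2 with 0.5 <= frac < 1
    biased_exp = exp2 + 128      # excess-128 exponent, as in the VAX format
    u = 2.0 * frac - 1.0         # 1 + U = 2 * frac lies in [1, 2)
    return biased_exp, u


def ln_from_parts(x):
    """Rebuild LN(x) from the exponent term and the LN(1+U) term."""
    if x <= 0.0:
        raise ValueError("LN is undefined for zero or negative input")
    biased_exp, u = split_vax_style(x)
    # log1p is a stand-in for the polynomial approximation of LN(1+U).
    return (biased_exp - 129) * LN2 + math.log1p(u)
```

The exponent term can be negative (inputs below 1), which is why the routine tracks its sign separately before merging it with the polynomial result.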
3.2.6 MCU EXP SUBROUTINE DESCRIPTION : EXPM


For the Exponential subroutine, X is the input variable, Y is the output 
variable, and S is the output status indicator. 

All 16 bits of S are normally 0. When S is 1 , overflow has occurred; S set to 
2 indicates underflow, a non fatal condition. 

Y is set at X'7FFFFFFF' if Y goes out of range (overflows); Y is set at
X'00000000' if X underflows.

The algorithm used for this subroutine is identical to the algorithm used for
the array Logarithm function (LNA) described in Section 2.


0. (ENTER EXPM.) 

1 . Set the status bits to 0 . 

Load R14 with X'0000'. 

2. (Check for X=0.)

If R9=0,

If R11=0,

Load register R2 with X'0000'.

Load register R0 with X'4080', a VAX '1'.
RETURN.

Else,

Continue.

3. R11 and R9 contain X. Adjust the X bits 9 bit positions to the left so
that the shifted X bits lie in R9 and R10 as follows.


Bit:    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
R11:   m1  m2  m3  m4  m5  m6  m7   0   0   0   0   0   0   0   0   0

Bit:    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
R12:    0   0   0   0   0   0   0   S   e   e   e   e   e   e   e   e


Perform cardinal multiply :


MLU R11,#X'200' .


4. Adjust the R9 X bits 9 bit positions to the left so that the shifted R9
X bits lie in R10 and R9 as follows:


Bit:    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
R10:  m17 m18 m19 m20 m21 m22 m23   0   0   0   0   0   0   0   0   0

Bit:    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
R9:     0   0   0   0   0   0   0  m8  m9 m10 m11 m12 m13 m14 m15 m16


Perform cardinal multiply :

MLU R9,#X'200'

5. "OR" R9 with R12 to get:


Bit:    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
R9:    m1  m2  m3  m4  m5  m6  m7  m8  m9 m10 m11 m12 m13 m14 m15 m16


6. Load register R2 with aHI=X'2E2A' where 'a' is 1/LN(2).
Load register R0 with aLO=X'8EC9'.

Then call the common multiply subroutine to obtain a*f:


CALL R15,MULT32$.

The results are stored in R0 and R2.

7. Add in the a/2 term:

ADD R0,#X'8ECA'

ADDC R2,#X'2E2A' .

These registers are the Ig bits of the final result.

8. Save the sign bit in R4, and clear the sign bit of the exponent in R11.


9. If the integer part of Ig is a '0', then the input exponent will
determine overflow if greater than 8. If the integer part of Ig
is '1', then overflow or underflow exists if the input exponent is
greater than 6:


If bit #1 of R2 is 0, (Ig integer bit)
then,

If R11 (input exponent) >= 128+8 (include bias),
then,

Perform step 10 error return.

else,

If R11 (input exponent) >= 128+7 (include bias),

then,

Perform step 10 error return.

else proceed to step 11 for normal execution;
no overflow or underflow exists.

10. When the input exponent is out of range:

If the sign bit of the exponent was positive,
then,

An overflow has occurred, so
set the output to the maximum VAX number
and set the status to '1'.

LR R0,#X'FFFF'
LR R2,#X'7FFF'

LR R14,#1
RETURN.

else,

since the exponent was large and the sign negative,
a non-fatal underflow condition exists; the output
becomes '0' and an underflow status is indicated:

CLR R0
CLR R2
LR R14,#2
RETURN.

11. (NORMAL entry)

Compare R11 (the input exponent) with 128-31;

If R11 is less than 128-31 (the input magnitude is less than 2**-31)
then,

force the output to '1' (since e**0 = 1).

else,

continue.

12. Move the Ig values to R6 and R7.


13. (The registers, (R6, R7) save the positive sum Ig=a*f=(Ig1+Ig2+Ig3)+(a/2)
with the form (0.1.30). The value of Ig must be shifted an amount determined
by the pure biased exponent value of X stored in R10. But before proceeding
with the shift, multiply Ig by 2**(-34). The scaled positive Ig value is
assumed to reside in the concatenation of registers, R10, R9, R8, R7, R6.

The bits of registers R10, R9, R8 must be loaded with the sign bit of 0.

The radix point of the scaled Ig value lies at the boundary between R10 and
R9 (i.e., between R10(15) and R9(0)). Now load the sign into the registers.)

Load R8 with 0.

Load R9 with 0.

Load R10 with 0.

(R10 will store the integer bits of the shifted Ig value.)


14. (The registers, (R6, R7) save the sum Ig=a*f=(Ig1+Ig2+Ig3)+(a/2) with
the form (0.1.30). The bits of R10, R9, R8, R7, R6 must be shifted left
an amount that is ultimately determined by the pure biased exponent value
of X stored in R11. Before proceeding, create the shift constant by
subtracting 128 and adding 34 to the value stored in R11.)

Add R11 to the negative of the exponent bias (-128) plus 34; store result
into R11. (R11 now contains the left shift value; it can't be bigger than
41 because of the operations of item 12.)


15. (Left shift the bits of R10, R9, R8, R7, R6 by the amount stored in R11.

The radix point between R10 and R9 remains fixed during the shift operations.
The shift is performed in 3 phases: shift by 32, shift by 16, and shift
by less than 16. The steps follow.)

(Check MSB of R11 exponent.)

If bit #10 of R11 = 0,

Continue. (MSB of R11 exponent is not set.)

Else

Load R9 with R7. (MSB of R11 exponent is set.)

Load R8 with R6. (Shift 32 bits.)

Load R7 with 0.

Load R6 with 0.

End If.


(Check next MSB of exponent.)

If bit #11 of R11 exponent is 0,

Continue. (Next MSB of exponent is not set.)

Else

Load R9 with R8. (Next MSB of exponent is set.)
Load R8 with R7.
Load R7 with R6.

Load R6 with 0.

End If.


"AND" X'000F' with R11; store result in R11. (Lowest 4 bits of shift value.)

(Note: The SHFM data table contains 16 bit storage locations.

The starting address of this set of 16 values is SHF$. The values are:


Address      Hex Value      Address      Hex Value

SHIFT+ 0     0001           SHIFT+ 8     0100
SHIFT+ 1     0002           SHIFT+ 9     0200
SHIFT+ 2     0004           SHIFT+10     0400
SHIFT+ 3     0008           SHIFT+11     0800
SHIFT+ 4     0010           SHIFT+12     1000
SHIFT+ 5     0020           SHIFT+13     2000
SHIFT+ 6     0040           SHIFT+14     4000
SHIFT+ 7     0080           SHIFT+15     8000


The shifts will be accomplished with multiplies using the factors from
the SHIFT block.)

Load R0 with value in SHIFT(R11*2) .

Cardinal multiply R9 times R11; store result low in R9 and result high in
R11.

Cardinal multiply R8 times R11; store result low in R11 and result
high in R12.

"OR" R12 with R9; store result in R9.

Cardinal multiply R7 times R12; store result low in R7 and result high in
R8.

"OR" R11 with R8; store result in R8.


16. (The R10, R9, R8 registers hold the shifted positive sum Ig=a*f
=(Ig1+Ig2+Ig3)+(a/2) value which has the form (0.8.32). The radix point
remains between R10 and R9. But before proceeding, make the shifted Ig value
a 2's complement number.)

If R4=0,

Continue.

Else

Complement R10. (R10 stores the integer bits of the signed shifted
Ig value.)

Complement R9.

Complement R8.


End If.


16. (The R10, R9, R8 registers hold the signed, shifted positive sum Ig=a*f
=(Ig1+Ig2+Ig3)+(a/2) value which has the form (8.8.32). The radix point
remains between R10 and R9. To generate the biased exponent of the output
Y value, add the bias of 128 plus 1 to the R10 value.)

Add 128+1 to R10.

17. (The R9, R8 registers hold the signed, shifted positive fraction of the
sum Ig=a*f=(Ig1+Ig2+Ig3)+(a/2) value which has the form (0.0.32). The radix
point remains between R10 and R9. R10 contains the biased exponent of Y.
Check to see that the exponent is still in bound.)

Load R12 with X'FF00'.

"AND" R12 with R10; store results into R12.

If R12=0

Continue.

Else

Load R14 status with X'0001'.

Load R2 with X'7FFF'.

Load R0 with X'FFFF'.

RETURN.


End If.


18. (The R9, R8 registers hold the signed, shifted positive fraction of the
sum Ig=a*f=(Ig1+Ig2+Ig3)+(a/2) value which has the form (0.0.32). The radix
point remains between R10 and R9. R10 contains an in range biased exponent of
Y. The fraction in (R9, R8), called ff, determines the value of the function,
H=((2**(ff))/2)-.75 . The range of ff in HEX is X'00000000'*2**(-32) to
X'FFFFFFFF'*2**(-32). Bias ff by .5 to create fb. Then
H=((2**(fb+.5))/2)-.75 and the range of fb in HEX is X'80000000'*2**(-32)
to X'7FFFFFFF'*2**(-32), i.e., -.5 <= fb < +.5 . (The range of H in HEX is
-.25 <= H < +.25 ; its form is (1.0.31).) Proceed to bias ff by .5 .)


19. (Call POLY32(R9, R10, R8, R5, R4, R3, R2). The input fraction fb is put
into R9, R10. The coefficients of the polynomial that fits H are in the
COEF block of the MCU memory. R8 holds the starting address of the COEF
block. The degree of the polynomial to approximate, N, is held by R7.
Each coefficient has the form (1.0.31). The coefficients are stored
in the COEF block in ascending order. In particular,


HEX address     Coefficient

COEF+ 0         A(0) (low)
COEF+ 1         A(0) (high)
COEF+ 2         A(1) (low)
COEF+ 3         A(1) (high)
   .              .
   .              .
COEF+2*N        A(N) (low)
COEF+2*N+1      A(N) (high)


Finally, after execution of POLY, H with form (1.0.31) is returned in
R2, R0. Note that POLY computes

H=A(0)+fb*(A(1)+fb*(A(2)+fb*(A(3)+fb*(A(4)+...+fb*(A(N-1)+fb*(A(N)))...).
The steps to use POLY follow.)

Load R8 with COEF block starting address.

Load R7 with the degree of the polynomial, N.

CALL R15,POLY32$

R2 and R0 contain the output of POLY32, R11 contains the exponent;
complement the sign bit of R2 (this is the hidden fraction bit).

20. (Pack data.)

Cardinal multiply R11 exponent with X'0080'; store into R11, R12.
(Ignore hi part that fills R11. The Y sign bit and biased exponent
are now properly positioned in R12.)

Cardinal multiply R2 with X'0100'; store into R2, R3. (The Y mantissa,
biased by .5, now is positioned correctly for merging with the sign
and exponent data.)

"AND" R2 with X'007F'; store into R2.

"OR" R2 with R12; store into R2.

Cardinal multiply R0 with X'0100'; store into R0.
"OR" R0 with R3; store into R3.


RETURN
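
The arithmetic idea behind EXPM (scale by a = 1/LN(2), then split into an integer exponent adjustment and a fraction fed to the polynomial) can be sketched in Python. This is a floating point illustration under stated assumptions, not the register-level algorithm: it assumes the 32-bit constant X'2E2A8EC9' holds a/2 with 30 fraction bits, which is consistent with step 7 adding nearly the same constant as the a/2 term. The name `exp_by_parts` is hypothetical and `2.0 ** ff` stands in for the polynomial fit of H.

```python
import math

INV_LN2_FIXED = 0x2E2A8EC9            # aHI:aLO from step 6
# Assumed scaling: the constant is a/2 with 30 fraction bits, so a = value / 2**29.
INV_LN2 = INV_LN2_FIXED / (1 << 29)


def exp_by_parts(x):
    """Compute e**x as 2**Ig with Ig = a*x split into integer and fraction."""
    ig = x * INV_LN2                  # e**x = 2**Ig
    n = math.floor(ig)                # integer part: adjusts the exponent of Y
    ff = ig - n                       # fraction, 0 <= ff < 1
    h = (2.0 ** ff) / 2.0 - 0.75      # quantity fitted by the polynomial, -.25 <= H < .25
    mantissa = h + 0.75               # lies in [0.5, 1), a normalized fraction
    return math.ldexp(mantissa, n + 1)
```

The `n + 1` in the last line mirrors step 16, where the bias of 128 plus 1 is added to the integer part to form the biased exponent of Y.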
3.2.7 MCU COMMON SUBROUTINE : POLY32


This subroutine develops the value, "P", of a polynomial of degree "N" using
the independent variable "U" as the input. "U", the input, and "P", the
output, are 32 bit 2's complement numbers. The value "P" is given by

P=A0*(U**0)+A1*(U**1)+A2*(U**2)+A3*(U**3)+ ... +AN*(U**N) .

"P" is computed from right to left using

P=A0+U*(A1+U*(A2+U*(A3+U*(A4+U*(A5+U*(A6+ ... +U*(AN) ... )))))) .

The routine assumes that -1/4 <= U < 1/4 and "U" has the signed magnitude
format [SU, (0.31.0)]*2**(-32) and that -1/4 <= P < 1/4 and "P" has the
2's complement format (1.31.0)*2**(-32) .
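
The right-to-left evaluation above is Horner's rule. A minimal Python sketch follows; the function names are hypothetical, and the fixed-point variant uses ordinary signed Python integers scaled by 2**32 rather than the signed magnitude and 2's complement register formats of the actual routine.

```python
def horner(coeffs, u):
    """Evaluate A0 + U*(A1 + U*(A2 + ... + U*AN)) from right to left."""
    p = coeffs[-1]                        # start with AN
    for a in reversed(coeffs[:-1]):       # AN-1, ..., A1, A0
        p = a + u * p
    return p


def horner_fixed(coeffs_fixed, u_fixed, frac_bits=32):
    """Same recurrence on integers scaled by 2**frac_bits.

    Each product keeps only its high part, as a 32-bit multiply that drops
    the low half of the product would.
    """
    p = coeffs_fixed[-1]
    for a in reversed(coeffs_fixed[:-1]):
        p = a + ((u_fixed * p) >> frac_bits)
    return p
```

Each pass through the loop corresponds to one multiply followed by one two-word add of the next coefficient, so a polynomial of degree N costs N multiplies.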

Whatever the starting location of the memory space that stores the "A"
coefficients, they are assumed stored in the sequence:


Address        Item

"A"+ 0         A0(hi)
"A"+ 2         A0(lo)
"A"+ 4         A1(hi)
"A"+ 6         A1(lo)
"A"+ 8         A2(hi)
"A"+10         A2(lo)
"A"+12         A3(hi)
"A"+14         A3(lo)
   .             .
   .             .
"A"+4*N        AN(hi)
"A"+4*N+2      AN(lo) ;

the "A" coefficients are assumed to have the same form as "P". The "A"
block is stored outside of this subroutine.

The routine is given the location of the last 2 byte word of the "A"
coefficient block in R8. The degree of the polynomial is given in R7.

The degree of the polynomial is at least 1 .


The entry branch and link register for this subroutine is RF. This subroutine,
in turn, calls the subroutine, MULT32, as an internal subroutine (i.e., no
BAL register is used) .

Registers directly required by the POLY32 routine are marked with a "*" below.
Registers indirectly required by the MULT32 routine are marked with a "$"
below.

  Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
POLY32 Use:|   | * |   |   | * | * | * | * |   |   |   |   | * |   | * |
MULT32 Use:|   | $ |   |   | $ | $ |   |   |   |   | $ | $ | $ | $ | $ |


ON ENTRY:

R7=N

R8=COEF[function]+4*N+2

R9=Uhi

RA=Ulo

RD=SU (the sign of U exists in the most significant bit location; all
remaining bits are "0".)


ON EXIT:

R0=Plo

R2=Phi


1. POLY32 entry .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   |   |   |   |Uhi|Ulo|loc|cnt|   |   |   |   |   |   |   |


2. R0=0(R8); R8=R8-2 . (Load R0 with ANlo; decrement the address register.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt|   |   |   |   |   |   | AN|
         |   |   |   |   |   |   | -2|   |   |   |   |   |   |   | lo|


3. R2=0(R8); R8=R8-2 . (Load R2 with ANhi; decrement the address register.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt|   |   |   |   | AN|   | AN|
         |   |   |   |   |   |   | -2|   |   |   |   |   | hi|   | lo|


4. Bal(RF) to MULT32.MS . (V is the 2's complement multiplier result.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt|bal|   |   |   |Vhi|   |Vlo|


5. R0=R0+0(R8) ; R8=R8-2 . (No carry-in; save carry-out.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt|   |   |   |   |Vhi|   |Vlo|
         |   |   |   |   |   |   | -2|   |   |   |   |   |   |   | + |
         |   |   |   |   |   |   |   |   |   |   |   |   |   |   |Alo|


6. R2=R2+0(R8) ; R8=R8-2 . (Carry-in.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt|   |   |   |   |Vhi|   |Alo|
         |   |   |   |   |   |   | -2|   |   |   |   |   | + |   |   |
         |   |   |   |   |   |   |   |   |   |   |   |   |Ahi|   |   |


7. R7=R7-1 . (Decrement count register.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt| - |   |   |   |Ahi|   |Alo|
         |   |   |   |   |   |   |   | -2|   |   |   |   |   |   |   |


8. IF R7=0

RETURN via RF register.

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|SAV| SU|   |   |Uhi|Ulo|loc|cnt| - |   |   |   |Ahi|   |Alo|


Else

Go to 4.


End If.


END
3.2.8 MCU COMMON SUBROUTINE : MULT32


This subroutine multiplies a 2's complement 32 bit "A" times a 32 bit signed
magnitude number, "U", to produce a 2's complement 32 bit "V". The low 32 bits
of the product are dropped; the high 32 bits form "V".
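
The partial-product scheme can be modeled in a few lines of Python. This is a sketch under stated assumptions: registers are 16 bits wide, the cardinal multiply yields a 32-bit unsigned product, and the function names are hypothetical. Because only the high word of each cross product is kept and the Alo*Ulo product is never formed, the modeled magnitude can fall up to two units in the last place below the exact high half.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def mult32_high(a_mag, u_mag):
    """Approximate high 32 bits of a 32x32-bit unsigned product from 16-bit pieces."""
    a_hi, a_lo = a_mag >> 16, a_mag & MASK16
    u_hi, u_lo = u_mag >> 16, u_mag & MASK16
    p = a_hi * u_hi                  # Ahi*Uhi: full 32-bit partial product
    p += (a_lo * u_hi) >> 16         # Alo*Uhi: only the high word is kept
    p += (a_hi * u_lo) >> 16         # Ahi*Ulo: only the high word is kept
    return p                         # Alo*Ulo is never formed


def mult32_signed(a, u_sign, u_mag):
    """a: 32-bit 2's complement pattern; u: sign flag (0 or 1) plus 32-bit magnitude."""
    a_sign = a >> 31
    a_mag = (~a & MASK32) if a_sign else a   # .NOT. used to form the magnitude
    v = mult32_high(a_mag, u_mag)
    if a_sign ^ u_sign:
        v = ~v & MASK32                      # .NOT. used to apply the product sign
    return v
```

The sign of the product is the exclusive OR of the two input signs, which is why the routine can work on magnitudes throughout and restore the sign once at the end.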


Registers directly required by the MULT32 routine are marked with a "$"
below.

  Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
MULT32 Use:|   | $ |   |   | $ | $ |   |   |   |   | $ | $ | $ | $ | $ |


ON ENTRY:

R0=Alo

R2=Ahi

R9=Uhi

RA=Ulo

RD=SU (the sign of U)

ON EXIT:

R0=Vlo

R2=Vhi

1. MULT32 entry .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   |   |   |Ahi|   |Alo|


2. R4=X'8000' . (Generate magnitude of "A" input; sign in R4.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   |Use|   |Ahi|   |Alo|


R4=R4.AND.R2 . (Sign of "A" ends up in R4(0), 0 in remaining bits.)

3. IF R4=0

Continue .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SA|   |Ahi|   |Alo|


Else

R0=.NOT.R0 .

R2=.NOT.R2 .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SA|   |Ahi|   |Alo|

End If.

4. R4=R4.EXCLUSIVE OR.RD . (Sign of "V", the product output.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|   |Ahi|   |Alo|
         |   |   |   |   |   |   |   |   |   |   |   |   |mag|   |mag|


5. (R0, R1)=R0*R9 . (Cardinal multiply; Alo*Uhi.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|   |Ahi|Alo|Alo|
         |   |   |   |   |   |   |   |   |   |   |   |   |mag|mag|mag|
         |   |   |   |   |   |   |   |   |   |   |   |   |   | * | * |
         |   |   |   |   |   |   |   |   |   |   |   |   |   |Uhi|Uhi|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |mag|mag|
         |   |   |   |   |   |   |   |   |   |   |   |   |   | lo| hi|


6. R1=R2 .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|   |Ahi|Ahi|Alo|
         |   |   |   |   |   |   |   |   |   |   |   |   |mag|mag|mag|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |   | * |
         |   |   |   |   |   |   |   |   |   |   |   |   |   |   |Uhi|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |   |mag|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |   | hi|


7. (R3, R2)=R2*R9 . (Cardinal multiply; Ahi*Uhi.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|Ahi|Ahi|Ahi|Alo|
         |   |   |   |   |   |   |   |   |   |   |   |mag|mag|mag|mag|
         |   |   |   |   |   |   |   |   |   |   |   | * | * |   | * |
         |   |   |   |   |   |   |   |   |   |   |   |Uhi|Uhi|   |Uhi|
         |   |   |   |   |   |   |   |   |   |   |   |mag|mag|   |mag|
         |   |   |   |   |   |   |   |   |   |   |   | lo| hi|   | hi|


8. R3=R3+R0 . (No carry-in; preserve carry-out.) "P"=|A|hi*Uhi+|A|lo*Uhi

R2=R2+0 . (Carry-in.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|Plo|Phi|Ahi| - |
         |   |   |   |   |   |   |   |   |   |   |   |   |   |mag|   |


9. R0=R1 . (Put |A|hi into R0.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|Plo|Phi| - |Ahi|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |   |mag|


10. (R0, R1)=R0*RA . (Cardinal multiply; Ahi*Ulo.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV|Plo|Phi|Ahi|Ahi|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |mag|mag|
         |   |   |   |   |   |   |   |   |   |   |   |   |   | * | * |
         |   |   |   |   |   |   |   |   |   |   |   |   |   |Ulo|Ulo|
         |   |   |   |   |   |   |   |   |   |   |   |   |   |mag|mag|
         |   |   |   |   |   |   |   |   |   |   |   |   |   | lo| hi|




11. R0=R0+R3 . (No carry-in; preserve carry-out.) |V|=P+|A|hi*Ulo .
R2=R2+0 . (Carry-in.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | SV| - |Vhi| - |Vlo|
         |   |   |   |   |   |   |   |   |   |   |   |   |mag|   |mag|


12. IF R4=0

Continue . (Product sign bit is 0.)

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | - | - |Vhi| - |Vlo|


Else

R0=.NOT.R0 .

R2=.NOT.R2 .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | - | - |Vhi| - |Vlo|


End If.


13. RETURN .

Register:| RE| RD| RC| RB| RA| R9| R8| R7| R6| R5| R4| R3| R2| R1| R0|
     Use:|   | SU|   |   |Ulo|Uhi|   |   |   |   | - | - |Vhi| - |Vlo|


END
3.2.9 MCU COMMON SUBROUTINE : NORMV


This subroutine takes 32 bits of fractional data input contained in two 16-bit 
halves and performs a floating point normalization. A VAX 32-bit format 
floating point number with sign and biased exponent becomes the resultant 
output. 
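
The effect of the normalization can be sketched in Python as follows. This is a behavioral model under stated assumptions, not a transcription of the steps below: it assumes the standard VAX F layout (sign bit, 8-bit excess-128 exponent, hidden leading fraction bit), truncates rather than rounds, and uses `int.bit_length` in place of the bit tests and table-driven shifts. The name `normalize_to_vax` and the parameter names are hypothetical.

```python
def normalize_to_vax(frac32, scale=0, sign=0):
    """Normalize a 32-bit fraction (value frac32 / 2**32, times 2**scale).

    sign is X'8000' or 0. Returns (high word, low word) in VAX F layout.
    """
    if frac32 == 0:
        return 0, 0                               # true VAX zero
    shifts = 32 - frac32.bit_length()             # left shifts needed to set the top bit
    normalized = (frac32 << shifts) & 0xFFFFFFFF
    exponent = 128 + scale - shifts               # excess-128 biased exponent
    if not 0 < exponent < 256:
        raise OverflowError("exponent outside the VAX F range")
    mantissa = (normalized >> 8) & 0x7FFFFF       # keep 24 bits, drop the hidden leading 1
    high = (sign & 0x8000) | (exponent << 7) | (mantissa >> 16)
    low = mantissa & 0xFFFF
    return high, low
```

For instance, a fraction of 0.5 with a scale factor of 1 represents 1.0 and packs to X'4080', X'0000'.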


Input registers:

R3 contains the High half of the input fraction.

R1 contains the Low half of the input fraction.

R10 contains a 2's complement scale factor (known bias)
or '0' if none.

R14 contains the sign bit of the result; given as
either X'8000' or '0'.

Output registers:


R2 High half of VAX aligned, normalized input (sign and exponent).
R0 Low half of VAX aligned, normalized input.


0. ENTRY NORMV. 

1. If the most significant half of the input is '0'

then, the input is less than 2**-16.

   If the least significant half of the input is '0',
   then, the output must be true VAX '0'.

      CLR R0
      CLR R2
      RETURN.

   else, swap the input halves and adjust the exponent of R10
   by 128-16 (bias plus a shift of 16).

      LR R3,R1
      CLR R1
      ADD R10,#128-16.

else, some bits are set in the high half; add bias to exponent.

      ADD R10,#128


2. Check the high byte of the input fraction for any bits set. Note R12
   counts the number of shifts required to obtain bit(0) set in the
   input fraction; initialize R12 to 0.

        CLR R12
        LR  R5,#X'FF00' (mask for high byte).
        AND R5,R3

   If no bits are set in the high byte, a shift of 8 may be performed by
   a multiplication, and the exponent adjusted by 8:

        MLU R3,#X'100'
        MLU R1,#X'100'
        OR  R4,R1
        SUB R10,#8 (exponent adjust).

   Otherwise, starting with bit 0 and working through bit 6 of R3,
   test each bit for a '1', and increment R12 if not set:

        BBS (R3,0),NORMALIZE
        INCR R12
          .
          .
        BBS (R3,6),NORMALIZE
        INCR R12.


3. (NORMALIZE). 

Adjust the final exponent in R10 by the shift as indicated by R12.
From the number of shifts required, obtain a shift multiplication
factor contained in Table SHF$ (R12 + SHF$) and perform the exact
number of shifts required to normalize R3 and R1:

        SUB R10,R12
        ASL R12
        ADD R12,SHF$
        MLU R3,(R12)
        MLU R1,(R12)
        OR  R4,R1.

The normalized fraction is now contained in registers R4 and R2.


4. Now align the normalized fraction of R4 and R2 into VAX format
   (mantissa bits only) and clear out the suppressed MSB of the mantissa:

        MLU R2,#X'100'
        MLU R4,#X'100'
        AND R4,#X'007F'.

   R4 now contains the correct high half mantissa bits. R2 and R5 must be
   "ORed" to obtain the low half mantissa bits:

        OR  R2,R5
        LR  R0,R2.

   R0 now contains the output data for the low half VAX format.


5. The exponent of R10 must be inserted into the high half of the VAX format,
   and the final sign bit of R14 "ORed" in:

        MLU R10,#X'100' (shift the exponent).
        OR  R11,R4
        LR  R2,R11
        OR  R2,R14

RETURN. 
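
The overall effect of NORMV can be sketched in Python as follows. This is an illustration under stated assumptions, not a transcription of the MCU code: it uses ordinary integer shifts instead of the multiply-by-table technique, it truncates rather than rounds the mantissa, it does not check the exponent range, and the function name `normv` is hypothetical. The input fraction is taken to have its binary point to the left of bit 0 of the high half.

```python
def normv(frac_high, frac_low, scale=0, sign=0):
    """Normalize a 32-bit fraction into VAX 32-bit (high, low) halves.

    frac_high, frac_low : 16-bit halves of the unsigned input fraction
    scale               : signed scale factor added to the exponent
    sign                : 0x8000 for a negative result, 0 otherwise
    """
    frac = ((frac_high & 0xFFFF) << 16) | (frac_low & 0xFFFF)
    if frac == 0:
        return 0, 0                       # true VAX zero

    exponent = scale + 128                # excess-128 bias
    shift = 32 - frac.bit_length()        # shifts needed to set the MSB
    frac = (frac << shift) & 0xFFFFFFFF
    exponent -= shift

    # Drop the suppressed MSB and keep the next 23 bits (truncation is
    # an assumption of this sketch).
    mantissa = (frac >> 8) & 0x7FFFFF

    high = sign | ((exponent & 0xFF) << 7) | (mantissa >> 16)
    low = mantissa & 0xFFFF
    return high, low
```

With a fraction of one half (X'8000', X'0000'), no scale and a positive sign, the sketch yields the halves X'4000' and X'0000', the VAX encoding of 0.5.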


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3.3 MCU SEQUENTIAL ALGORITHMS HOL INTERFACE 


All of the MCU subroutines use identical interface conventions for input and
output data formats. Each subroutine requires that input data be loaded (from
MCU memory) into MCU registers R9 and R11. Upon completion of each
subroutine, the output function will be contained in MCU registers R2 and R0.
Error status is returned in R14.


Because the VAX 32-bit format requires two 16-bit MCU storage locations, a
"high half" register (which includes the sign and exponent) and a "low half"
register (least significant mantissa bits) will be defined as shown below.

X input registers:

                          1 1 1 1 1 1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
R11 - s e e e e e e e e m m m m m m m    High half X
        0 1 2 3 4 5 6 7 1 2 3 4 5 6 7

                          1 1 1 1 1 1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
R9  - m m m m m m m m m m m m m m m m    Low half X
          1 1 1 1 1 1 1 1 1 1 2 2 2 2
      8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3

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Y output registers: 


                          1 1 1 1 1 1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
R2  - s e e e e e e e e m m m m m m m    High half Y
        0 1 2 3 4 5 6 7 1 2 3 4 5 6 7

                          1 1 1 1 1 1
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
R0  - m m m m m m m m m m m m m m m m    Low half Y
          1 1 1 1 1 1 1 1 1 1 2 2 2 2
      8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3

(The symbol, *, indicates the location of the radix point for the 
value stored in a register.) 
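
For readers who want to check register contents by hand, the following Python sketch converts between an ordinary floating-point value and the high-half/low-half pair described above. It assumes the standard VAX 32-bit layout (one sign bit, an 8-bit excess-128 exponent, a 23-bit mantissa with a suppressed leading bit, and a fraction in the range 0.5 to 1). The function names are hypothetical, the mantissa is truncated, and the reserved-operand encoding (sign set, exponent zero) is not modeled.

```python
import math


def to_vax_halves(x):
    """Encode x as (high_half, low_half) 16-bit words."""
    if x == 0.0:
        return 0, 0
    sign = 0x8000 if x < 0 else 0
    frac, exp = math.frexp(abs(x))        # abs(x) = frac * 2**exp, 0.5 <= frac < 1
    biased = exp + 128
    if not 1 <= biased <= 255:
        raise OverflowError("value outside the VAX 32-bit exponent range")
    mantissa = int(frac * (1 << 24)) & 0x7FFFFF   # drop the suppressed MSB
    high = sign | (biased << 7) | (mantissa >> 16)
    low = mantissa & 0xFFFF
    return high, low


def from_vax_halves(high, low):
    """Decode (high_half, low_half) back to a Python float."""
    biased = (high >> 7) & 0xFF
    if biased == 0:
        return 0.0
    mantissa = 0x800000 | ((high & 0x7F) << 16) | (low & 0xFFFF)
    value = math.ldexp(mantissa, biased - 128 - 24)
    return -value if high & 0x8000 else value
```

Under these assumptions the value 1.0 is held as X'4080' in the high half and X'0000' in the low half.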


R14 - Status storage:


111111 

0123456789012345 


s s 


s s s s 


1 


2 


3 


5 


6

The loading of the MCU registers upon input, and the eventual storage to MCU
memory, is the responsibility of either an HOL or MCL macro; however, the
recommended convention for storage is as follows:


ADDRESS High Half Data 

ADDRESS+2 Low Half Data 


Note - All MCU registers will be destroyed upon completion of each function,
with the output and status as shown above.


Each MCU algorithm description provides the details of error detection in 
section 3.2 . For interface convenience, the specific error conditions have 
been summarized in Table 3.0; normal status is "0", for all functions. 




Table 3.0 MCU SUBROUTINE ERROR STATUS


FUNCTION             ERROR STATUS (R15)          Y OUTPUT

Natural Logarithm    '1' denotes overflow        X'7FFFFFFF'
                     '2' denotes underflow       X'0'
                     '3' denotes X negative      undetermined (original X)

Exponential          '1' denotes overflow        X'7FFFFFFF'
                     '2' denotes underflow       X'0'

Square Root          '3' denotes negative X      undetermined (original X)

Sine                 '3' denotes X > 2**24       undetermined *

Cosine               '3' denotes X > 2**24       undetermined *

Arctangent           '0' always                  Arctangent


* angular uncertainty net 2 radians 
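
A calling program might interpret the status word with a small lookup such as the Python sketch below. The status codes and Y outputs are taken from Table 3.0; the dictionary name, the function keys ("LN", "EXP", and so on) and the `describe_status` helper are hypothetical and exist only for this illustration.

```python
# Status codes per function, as listed in Table 3.0.
# A Y output of None stands for "undetermined".
ERROR_STATUS = {
    "LN":   {1: ("overflow", 0x7FFFFFFF),
             2: ("underflow", 0x00000000),
             3: ("X negative", None)},
    "EXP":  {1: ("overflow", 0x7FFFFFFF),
             2: ("underflow", 0x00000000)},
    "SQRT": {3: ("negative X", None)},
    "SIN":  {3: ("X > 2**24", None)},
    "COS":  {3: ("X > 2**24", None)},
    "ATAN": {},                       # status is always 0
}


def describe_status(function, status):
    """Return a readable description of a status code for a function."""
    if status == 0:
        return f"{function}: normal"
    meaning, y_output = ERROR_STATUS[function][status]
    if y_output is None:
        return f"{function}: {meaning}, Y undetermined"
    return f"{function}: {meaning}, Y = X'{y_output:08X}'"
```

A status value that the table does not list for a given function raises `KeyError` in this sketch, which makes an unexpected code easy to spot during testing.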
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4.0 USAGE OF THE MPP SCIENTIFIC SUBROUTINES 


The following sections describe the filenames, library names, and conventions
for using the MPP scientific functions described in this document. The
information contained in this section describes the configuration of NASA MPP
disk files in UIC [2,5]. The details of creation of libraries and tasks may
differ slightly for use with other configurations.

Table 4.0 contains a list of all PRL and MCL filenames and their global entry
points for the parallel array routines. The serial MCU routine filenames and
global entry points are listed in Table 4.1.

Section 4.1 contains MPP applications programmer information.

Section 4.2 contains systems programmer information for generating the new VAX
subroutines and/or libraries from source.

Section 4.3 describes the MCL macros that exist for the VAX 32-bit format
functions, and contains programmer information for using the functions directly
from MCL.

4.1 MCU APPLICATIONS


MCU applications programmers assume the existence of 'PELIBR.PST' (a global
symbol table that must be used to build MCU tasks) and 'PELIBR.PTK', the
parallel subroutine library task (that must be loaded into PECU memory to
provide array functions). These files reside in UIC [2,1].

In order to preserve existing MPP applications, the VAX scientific functions
described in this document have been incorporated into 'PELIBR.PTK'. (See
Section 4.2 for instructions to include the VAX subroutines.)

Existing array functions have not been affected, and existing MPP applications
should experience no difficulty in using the new version of 'PELIBR.TSK'. The
VAX subroutines reside in high PECU memory addresses; therefore, entry points
to previously existing subroutines have not changed. In the event of
incompatibility with the new VAX version of the array functions, a task build
of the MCU programs (with reference to the new VAX PELIBR.PST) will be
required.

The VAX serial MCU algorithms and the MCU portions of the array functions have
been included in the existing library file '[2,1]MCLIBR.MOL'. (See Section 4.2
for instructions to include the VAX routines.)

All existing MCU tasks must make reference to 'MCLIBR.MOL'; therefore, if
sequential MCU functions or parallel VAX functions are incorporated into
existing applications, the task build procedure for the MCU portion will not
be affected. New applications that may include VAX scientific functions
should refer to Section 3.5.4 of MPP User's Guide GER-17141, which describes
the procedure for building an MCU program.


4.2 SYSTEM GENERATION and MAINTENANCE 


This Section describes the procedures for incorporating the MPP VAX Scientific
Subroutines into the MPP system. A complete list of filenames for the array
subroutines and MCU subroutines is provided in Table 4.0 and Table 4.1.
The files reside in UIC [2,5] of the NASA PDP-11 system disk.

Generation of the scientific functions presumes the existence of PE Subroutine
object library [2,1]PELIBR.POL and MCU object library [2,1]MCLIBR.MOL. Files
are obtained from [2,1]PELIBR.POL, and files are inserted into
[2,1]MCLIBR.MOL; therefore, these libraries must exist prior to generation of
the VAX scientific functions.
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Table 4.0 ARRAY SCIENTIFIC SUBROUTINES 
Filename References


PE SUBR    MACRO         MCU SUBROUTINE   MCU GLOBAL   PECU SUBR       PECU GLOBAL
MACRO      SOURCE        SOURCE           ENTRY        SOURCE          ENTRY
NAME       FILE          FILE             NAME         FILE            NAME

LNA        LNA.MCL       LNV.MCL          LN$V         LNV.PRL         LNV$
                                                       NRMZV.PRL       NRMZV$

EXPA       EXPA.MCL      EXPV.MCL         EXP$V        EXPV.PRL        EXPV$
                                                       EXPSHIFT.PRL    EXPSH$
                                                       EXPUM.PRL       EXPUM$

SINA       SINA.MCL      SNCSNV.MCL       SNCS$V       VFSC1.PRL       VFSC1$
COSA       COSA.MCL      "                "            VFSC2.PRL       VFSC2$
SINCOS     SINCOS.MCL    "                "            VFSC3.PRL       VFSC3$
                                                       VFSC4.PRL       VFSC4$
                                                       VFSC5.PRL       VFSC5$

SQRTA      SQRTA.MCL     N/A              N/A          SQRTV.PRL       SQRTV$

ARCTNA     ARCTNA.MCL    N/A              N/A          ATANV.PRL       ATANV$
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Table 4.1 MCU SCIENTIFIC SUBROUTINES 
Filename References


MAIN CONTROL UNIT SUBROUTINES

MCU SUBROUTINE NAME    MCU SOURCE CODE    MCU GLOBAL ENTRY

LNM                    LNMV.MCL           LNM$V
EXPM                   EXPM.MCL           EXPM$V
SINM                   SINM.MCL           SINM$V
COSM                   COSMV.MCL          COSM$V
SQRTM                  SQRTM.MCL          SQRTM$V
ATANM                  ATANMV.MCL         ATNM$V

COMMON SUBROUTINES

                       MULT32.MCL         MULT32$
                       NORMV.MCL          NORMV$
                       POLY32.MCL         POLY32$
                       SHFM.MCL           SHFM$
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Several RSX-11M command files have been written to create the VAX functions 
from source, create intermediate libraries, and to build a new PE subroutine 
library task. These command files may be invoked in the logical order 
necessary to generate all the required files by executing the MCR command: 

     MCR> @VAXLIBGEN

This invokes a command file which prompts the operator to make decisions on 
system generation steps. All, part, or none of the steps may be performed. 
This provides the operator with several options of starting, or terminating 
the system generation of the scientific functions. 

The results of the VAXLIBGEN command file are: 


- all PECU object files are concatenated into:

     [2,5]VAXPELIBR.POL which is a new object library;

- all MCU object files are inserted into:

     [2,1]MCLIBR.MOL which is an existing object library;

- a new PECU subroutine library task and symbol table are created:

     [2,5]VAXPELIBR.PTK which is the task,

     and [2,5]VAXPELIBR.PST which is the symbol table.


The libraries should be maintained with the current versions of the VAX 
scientific functions. 

From the command file VAXLIBGEN, as described above, the final PECU subroutine
library task and symbol table are created in UIC [2,5] (VAXPELIBR.PTK and
VAXPELIBR.PST).

These are temporary versions of the PECU subroutines that may be loaded and
tried. After verification, these two files must be renamed and transferred to
the library UIC [2,1] as PELIBR.TSK and PELIBR.PST to provide the system
compatibility described in Section 4.1.

A separate command file may be requested to perform this transfer and rename
by executing:

     MCR> @VAXSYSLIB

Note - This transfer command file also may be selected as part of the
previously mentioned VAXLIBGEN command file.

4.3 MCU MACROS


The array functions described in this document must initialize MCU Call Queue
registers and request either an MCU or PECU program to perform the desired
function. The operations required to execute array functions are listed in
Section 2.2. These operations have also been incorporated into MCL Macros for
the array functions only. The interface requirements for the serial MCU
functions are described in Section 3.3; since the MCU serial functions
require only register loads and stores, Macros have not been developed for
these functions.

The following Sections describe the Macros that exist for the VAX scientific
functions. Table 4.0 lists the Macro name for each function. These Macros
have been incorporated into the MCL Macro library and may be called by any
MCL program.

4.3.1 LNA - NATURAL LOGARITHM OF AN ARRAY


This instruction will compute the natural logarithm of a VAX 32-bit floating
point format variable in the array, and store the result in the array also
in VAX 32-bit format.


Format      LABEL | COMMAND | ARGUMENTS

            [s]   | LNA     | X,Y,E,[T]


•Label      The label field is optional.

•Command    LNA

•Arguments  Three arguments are required. T is an optional temporary storage
            array of at least 56 bits; if T is not specified, the top 56 bits
            of array memory will be used with the LSB at 973.

•X  VAX 32-bit floating point source.

•Y  VAX 32-bit floating point destination where Y = LN(X).

•E  Error bitplane; set where X was negative, clear otherwise.
    Where the error bit is set, Y is not determined.


•T Optional parameter specifying a 56-bit temporary storage area to be 

used by this subroutine. If not specified, array memory starting at 
LSB 973 will be used as scratch area. 
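
The relationship between the source, destination and error bitplane can be pictured with the Python sketch below, which treats each array element as one list entry. It is a behavioral illustration only, not the array implementation: the function name `lna_model` is hypothetical, ordinary Python floats stand in for VAX 32-bit values, the original X is left in Y where the error bit is set (the text says only that Y is not determined), and a zero input, which the text does not address, is simply left to raise `ValueError` from `math.log`.

```python
import math


def lna_model(x_values):
    """Return (y_values, error_plane) for an element-wise natural log.

    error_plane[i] is 1 where x_values[i] was negative, 0 otherwise.
    """
    y_values = []
    error_plane = []
    for x in x_values:
        if x < 0.0:
            error_plane.append(1)
            y_values.append(x)        # placeholder; Y is not determined
        else:
            error_plane.append(0)
            y_values.append(math.log(x))
    return y_values, error_plane
```

A caller would normally examine the error plane first and ignore every Y element whose error bit is set.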
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4.3.2 EXPA - EXPONENTIAL OF AN ARRAY 


This instruction will compute the exponential of a VAX 32-bit floating 
point format variable in the array, and store the result in the array also 
in VAX 32-bit format. 


Format      LABEL | COMMAND | ARGUMENTS

            [s]   | EXPA    | X,Y,E,[T]


•Label      The label field is optional.

•Command    EXPA

•Arguments  Three arguments are required. T is an optional temporary storage
            array of at least 43 bits; if T is not specified, the top 43 bits
            of array memory will be used with the LSB at 973.

•X  VAX 32-bit floating point source.

•Y  VAX 32-bit floating point destination where Y = EXP(X).

•E  3-bit error array:

       E(0) set where output clipped to 0
            because X < 2**-31.

       E(1) set where Y overflowed.

       E(2) set where Y underflowed.


•T Optional parameter specifying a 43-bit temporary storage area to be 

used by this subroutine. If not specified, array memory starting at 
LSB 973 will be used as scratch area. 
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4.3.3 SQRTA - SQUARE ROOT OF AN ARRAY 


This instruction will compute the square root of a VAX 32-bit floating 
point format variable in the array, and store the result in the array also 
in VAX 32-bit format. 


Format      LABEL | COMMAND | ARGUMENTS

            [s]   | SQRTA   | X,Y,E,[T]


•Label      The label field is optional.

•Command    SQRTA

•Arguments  Three arguments are required. T is an optional temporary storage
            array of at least 22 bits; if T is not specified, the top 22 bits
            of array memory will be used with the LSB at 973.

•X  VAX 32-bit floating point source.

•Y  VAX 32-bit floating point destination where Y = SQRT(X).

•E  Error bitplane; set where X was negative, clear otherwise.

•T Optional parameter specifying a 22-bit temporary storage area to be 

used by this subroutine. If not specified, array memory starting at
LSB 973 will be used as scratch area.

4.3.4 SINA, COSA - SINE OR COSINE OF AN ARRAY


These instructions will compute the sine or cosine of a VAX 32-bit floating 
point format variable in the array, and store the result in the array also 
in VAX 32-bit format. 


Format      LABEL | COMMAND | ARGUMENTS

            [s]   | SINA    | X,Y,[T]
                  | COSA    |


•Label      The label field is optional.

•Command    SINA or COSA

•Arguments  Two arguments are required. T is an optional temporary storage
            array of at least 90 bits; if T is not specified, the top 90 bits
            of array memory will be used with the LSB at 973.

•X  VAX 32-bit floating point source.

•Y  VAX 32-bit floating point destination where Y = SIN(X) or Y = COS(X).

•T  Optional parameter specifying a 90-bit temporary storage area to be
    used by this subroutine. If not specified, array memory starting at
    LSB 973 will be used as scratch area.

4.3.5 SINCOS - SINE AND COSINE OF AN ARRAY


This instruction will compute the sine and cosine of a VAX 32-bit floating
point format variable in the array, and store the results in the array also
in VAX 32-bit format.


Format      LABEL | COMMAND | ARGUMENTS

            [s]   | SINCOS  | X,Y,Z,[T]


•Label      The label field is optional.

•Command    SINCOS

•Arguments  Three arguments are required. T is an optional temporary storage
            array of at least 90 bits; if T is not specified, the top 90 bits
            of array memory will be used with the LSB at 973.

•X  VAX 32-bit floating point source.

•Y  VAX 32-bit floating point destination where Y = SIN(X).

•Z  VAX 32-bit floating point destination where Z = COS(X).

•T  Optional parameter specifying a 90-bit temporary storage area to be
    used by this subroutine. If not specified, array memory starting at
    LSB 973 will be used as scratch area.
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4.3.6 ARCTNA - ARCTANGENT OF AN ARRAY 


This instruction will compute the arctangent of a VAX 32-bit floating
point format variable in the array, and store the result in the array also
in VAX 32-bit format.


Format      LABEL | COMMAND | ARGUMENTS

            [s]   | ARCTNA  | X,Y,[T]


•Label      The label field is optional.

•Command    ARCTNA

•Arguments  Two arguments are required. T is an optional temporary storage
            array of at least 82 bits; if T is not specified, the top 82 bits
            of array memory will be used with the LSB at 973.

•X  VAX 32-bit floating point source.

•Y  VAX 32-bit floating point destination where Y = ATAN(X).


•T Optional parameter specifying an 82-bit temporary storage area to be

used by this subroutine. If not specified, array memory starting at 
LSB 973 will be used as scratch area. 
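
The argument lists and temporary storage sizes given in Sections 4.3.1 through 4.3.6 can be collected into one table for quick checking. The Python sketch below does this; the argument names and bit counts come from the macro descriptions above, while the `MACROS` dictionary and the `check_call` helper are hypothetical and are not part of the MCL Macro library.

```python
# Required arguments and minimum temporary storage (bits) per macro.
MACROS = {
    "LNA":    (("X", "Y", "E"), 56),
    "EXPA":   (("X", "Y", "E"), 43),
    "SQRTA":  (("X", "Y", "E"), 22),
    "SINA":   (("X", "Y"), 90),
    "COSA":   (("X", "Y"), 90),
    "SINCOS": (("X", "Y", "Z"), 90),
    "ARCTNA": (("X", "Y"), 82),
}


def check_call(name, args):
    """Match call arguments to a macro's parameters.

    Accepts the required arguments, optionally followed by T, and
    returns a dict mapping parameter names to the supplied arguments.
    """
    required, _temp_bits = MACROS[name]
    if len(args) not in (len(required), len(required) + 1):
        raise ValueError(
            f"{name} takes {len(required)} arguments plus an optional T")
    return dict(zip(required + ("T",), args))
```

For instance, `check_call("SINCOS", ["A", "B", "C"])` pairs the three supplied names with X, Y and Z and leaves T unset, which corresponds to using the default scratch area.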
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